Description
TECHNICAL FIELD 

The present invention relates to a hyperthermostable protease useful as an industrial enzyme, a gene encoding the same, and a method for preparing the enzyme by genetic engineering.

BACKGROUND ART 

Proteases are enzymes which cleave peptide bonds in proteins, and a number of proteases have been found in animals, plants and microorganisms. They are used not only as reagents for research and as medical supplies, but also in industrial fields, for example as additives for detergents, in food processing and in chemical synthesis utilizing the reverse reaction, and they are very important enzymes from an industrial viewpoint. Since very high physical and chemical stability is required of proteases used in industrial fields, enzymes having high thermostability are particularly preferred. At present, the proteases predominantly used in industrial fields are those produced by bacteria of the genus Bacillus, because they have relatively high thermostability.

However, enzymes having further superior properties are desired, and attempts have been made to obtain enzymes from microorganisms which grow at high temperature, for example, thermophiles of the genus Bacillus. On the other hand, a group of microorganisms named hyperthermophiles are well adapted to high-temperature environments and are therefore expected to be a source of various thermostable enzymes. It is known that one of these hyperthermophiles, Pyrococcus furiosus, produces proteases [Appl. Environ. Microbiol., volume 56, page 1992-1998 (1990), FEMS Microbiol. Letters, volume 71, page 17-20 (1990), J. Gen. Microbiol., volume 137, page 1193-1199 (1991)].

A hyperthermophile belonging to the genus Pyrococcus, Pyrococcus sp. strain KOD1, is reported to produce a thiol protease (cysteine protease) [Appl. Environ. Microbiol., volume 60, page 4559-4566 (1994)]. Bacteria belonging to the genera Thermococcus, Staphylothermus and Thermobacteroides, which are also hyperthermophiles, are known to produce proteases [Appl. Microbiol. Biotechnol., volume 34, page 715-719 (1991)].

OBJECTS OF THE INVENTION 


As the proteases produced by these hyperthermophiles have high thermostability, they are expected to be applicable to new uses in which no known enzyme has been utilized. However, the above publications merely teach that thermostable protease activities are present in a cell-free extract or in a crude enzyme solution obtained from a culture supernatant, and there is no disclosure of the properties of isolated and purified enzymes and the like. Only the protease produced by strain KOD1 has been obtained in purified form. However, since a cysteine protease has the defect that it easily loses its activity by oxidation, it is disadvantageous for industrial use. In addition, since cultivation of the microorganisms at high temperature is required to obtain enzymes from these hyperthermophiles, there is a problem in industrial production of the enzymes.

In order to solve the above problems, an object of the present invention is to provide a protease of a hyperthermophile which is advantageous for industrial use, to isolate a gene encoding a protease of a hyperthermophile, and to provide a method for preparing a hyperthermostable protease by genetic engineering using the gene.

DISCLOSURE OF THE INVENTION 

In order to obtain a hyperthermostable protease gene, the present inventors originally tried to purify a protease from microbial cells and a culture supernatant of Pyrococcus furiosus DSM3638 so as to determine a partial amino acid sequence of the enzyme. However, purification of the protease was very difficult whether the microbial cells or the culture supernatant was used, and the present inventors failed to obtain an enzyme sample of sufficient purity for determination of its partial amino acid sequence.

As a method for cloning a gene for an objective enzyme without any information about the primary structure of the enzyme protein, there is the expression cloning method. For example, a pullulanase gene originating in Pyrococcus woesei (WO92/02614) has been obtained according to this method. However, in an expression cloning method a plasmid vector is generally used, and in such a case it is necessary to use restriction enzymes which can cleave an objective gene into relatively small DNA fragments so that the fragments can be inserted into the plasmid vector without cleavage of any internal portion of the objective gene. Therefore, the expression cloning method is not always applicable to cloning of all kinds of enzyme genes. Furthermore, it is necessary to test a large number of clones for enzyme activity, and this operation is complicated.

The present inventors have attempted to isolate a protease gene by using a cosmid vector, which can maintain a larger DNA fragment, instead of a plasmid vector to prepare a cosmid library of the Pyrococcus furiosus genome

high thermostability. [...] of the hyperthermostable protease deduced from the nucleotide sequence [...] enzyme. Thus, since [...] to retain a structure similar to those of [...] furiosus described above. [...] the second protease gene [...]
acid sequence described in SEQ ID No. 3 of the Sequence Listing or functional equivalents thereof, as well as a hyperthermostable protease gene encoding the hyperthermostable proteases, inter alia, a hyperthermostable protease gene having the nucleotide sequence described in SEQ ID No. 4 of the Sequence Listing. Further, a gene which hybridizes with this hyperthermostable protease gene and encodes a hyperthermostable protease is also provided.

In addition, the third aspect of the present invention provides a hyperthermostable protease having the amino acid sequence described in SEQ ID No. 5 of the Sequence Listing or functional equivalents thereof, as well as a hyperthermostable protease gene encoding the hyperthermostable proteases, inter alia, a hyperthermostable protease gene having the nucleotide sequence described in SEQ ID No. 6 of the Sequence Listing. Further, a gene which hybridizes with this hyperthermostable protease gene and encodes a hyperthermostable protease is also provided.

Further, the present invention provides a method for preparation of the hyperthermostable protease which comprises cultivating a transformant containing the hyperthermostable protease gene of the present invention, and collecting the hyperthermostable protease from the culture.

As used herein, the term "functional equivalents" means as follows: 

It is known that, among naturally occurring proteins, a mutation such as a deletion, an addition or a substitution of one or a few (for example, up to 5% of the whole amino acids) amino acid(s) can occur in the amino acid sequence, owing to polymorphism or mutation of the genes encoding them as well as to modification reactions of the produced proteins in the living body or during purification, and that some proteins, in spite of such a mutation, show a physiological or biological activity substantially equivalent to that of the proteins having no mutation. When proteins differ slightly in structure and nevertheless no great difference in their functions is recognized, they are called functional equivalents. The same is true when the above mutations are artificially introduced into the amino acid sequence of a protein, and in this case a still greater variety of mutants can be made. For example, a polypeptide in which a certain cysteine residue in the amino acid sequence of human interleukin-2 (IL-2) is replaced with a serine residue shows interleukin-2 activity [Science, volume 224, page 1431 (1984)].
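
As a rough illustration of the example threshold mentioned above (up to 5% of the whole amino acids), the following Python sketch counts substitutions between two equal-length sequences and reports the fraction of altered positions. The function names, the toy sequences and the restriction to substitutions only are assumptions made for illustration; whether a variant is a functional equivalent depends on its retained activity, not on this count alone.

```python
# Toy sequences invented for illustration; they are not taken from the Sequence Listing.
REFERENCE = "ACDEFGHIKLMNPQRSTVWY" * 2           # 40 residues
VARIANT = REFERENCE[:10] + "A" + REFERENCE[11:]  # one substitution (M -> A at index 10)


def substitution_fraction(reference, variant):
    """Return the fraction of positions that differ between two equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("this sketch only compares equal-length sequences")
    differences = sum(1 for a, b in zip(reference, variant) if a != b)
    return differences / len(reference)


def within_example_limit(reference, variant, limit=0.05):
    """Check the variant against the 5% figure given as an example in the text."""
    return substitution_fraction(reference, variant) <= limit


if __name__ == "__main__":
    print(substitution_fraction(REFERENCE, VARIANT))
    print(within_example_limit(REFERENCE, VARIANT))
```

Deletions and additions would require an alignment step, which this sketch deliberately leaves out.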

The product transcribed and translated from the hyperthermostable protease gene of the present invention is an enzyme precursor (preproenzyme) containing two regions: one is a signal peptide necessary for extracellular secretion, and the other is a propeptide which is removed upon expression of the activity. When a transformant into which the above gene has been introduced can cleave this signal peptide, an enzyme precursor (proenzyme) from which the signal peptide has been removed is secreted extracellularly. Further, an active-form enzyme from which the propeptide has been removed is produced by a self-digestion reaction between proenzymes. The preproenzyme, the proenzyme and the active-form enzyme thus obtained from the gene of the present invention are all proteins which finally have the equivalent function, and they fall within the scope of "functional equivalents".

As is apparent to those skilled in the art, an appropriate signal peptide may be selected depending upon the host used for expression of a gene of interest. The signal peptide may be removed when extracellular secretion is not desired. Therefore, among the hyperthermostable protease genes disclosed herein, genes from which the portion encoding a signal peptide has been removed, and genes in which that portion is replaced with another nucleotide sequence, are also within the scope of the present invention in that they encode proteases showing essentially equivalent activity.

As used herein, a gene which "hybridizes to a hyperthermostable protease gene" refers to a gene which hybridizes with the hyperthermostable protease gene under stringent conditions, that is, those where incubation is carried out at 50 °C for 12 to 20 hours in 6 x SSC (1 x SSC represents 0.15 M NaCl, 0.015 M sodium citrate, pH 7.0) containing 0.5% SDS, 0.1% bovine serum albumin (BSA), 0.1% polyvinylpyrrolidone, 0.1% Ficoll 400 and 0.01% denatured salmon sperm DNA.
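
The buffer arithmetic implied by this definition can be checked with a short Python sketch. It simply scales the stated 1 x SSC composition; the function name and the rounding to six decimal places are assumptions made for illustration.

```python
# Composition of 1 x SSC as defined in the text, in mol/L.
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each salt in `fold` x SSC."""
    # Rounding only suppresses floating-point noise such as 0.8999999999999999.
    return {salt: round(conc * fold, 6) for salt, conc in SSC_1X.items()}


if __name__ == "__main__":
    # 6 x SSC is the hybridization buffer strength given above.
    print(ssc_molarity(6))
```

Under these definitions, 6 x SSC corresponds to 0.9 M NaCl and 0.09 M sodium citrate.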

BRIEF DESCRIPTION OF DRAWINGS

Fig. 1 is a figure showing a restriction map of a DNA fragment derived from Pyrococcus furiosus contained in the plasmid pTPR12 and the plasmid pUBP13.

Fig. 2 is a figure showing the design of the oligonucleotide PRO-1F.

Fig. 3 is a figure showing the design of the oligonucleotides PRO-2F and PRO-2R.

Fig. 4 is a figure showing the design of the oligonucleotide PRO-4R.

Fig. 5 is a restriction map of the plasmid p2F-4R.

Fig. 6 is a restriction map of the plasmid pTC3.

Fig. 7 is a restriction map of the plasmid pTCS6.

Fig. 8 is a restriction map of the plasmid pTC4.

Fig. 9 is a figure showing the procedures for constructing the plasmid pSTC3.

Fig. 10 is a restriction map of the plasmid pSTC3.

Fig. 11 is a figure comparing the amino acid sequences of the various proteases.

Fig. 12 is a continuation of Fig. 11.

Fig. 13 is a figure showing a restriction map of the Pyrococcus furiosus chromosomal DNA around the protease PFUS gene.

Fig. 14 is a restriction map of the plasmid pSPT1.

Fig. 15 is a restriction map of the plasmid pSNP1.

Fig. 16 is a restriction map of the plasmid pPS1.

Fig. 17 is a restriction map of the plasmid pNAPS1.

Fig. 18 is a figure showing the optimum temperature for the enzyme preparation TC-3.

Fig. 19 is a figure showing the optimum temperature for the enzyme preparation NAPS-1.

Fig. 20 is a figure showing the optimum pH for the enzyme preparation TC-3.

Fig. 21 is a figure showing the optimum pH for the enzyme preparation NP-1.

Fig. 22 is a figure showing the optimum pH for the enzyme preparation NAPS-1.

Fig. 23 is a figure showing the thermostability of the enzyme preparation TC-3.

Fig. 24 is a figure showing the thermostability of the enzyme preparation NP-1.

Fig. 25 is a figure showing the thermostability of the activated enzyme preparation NP-1.

Fig. 26 is a figure showing the thermostability of the enzyme preparation NAPS-1.

Fig. 27 is a figure showing the pH-stability of the enzyme preparation NP-1.

Fig. 28 is a figure showing the stability of the enzyme preparation NP-1 in the presence of SDS.

Fig. 29 is a figure showing the stability of the enzyme preparation NAPS-1 in the presence of SDS.

Fig. 30 is a figure showing the stability of the enzyme preparation NAPS-1 in the presence of acetonitrile.

Fig. 31 is a figure showing the stability of the enzyme preparation NAPS-1 in the presence of urea.

Fig. 32 is a figure showing the stability of the enzyme preparation NAPS-1 in the presence of guanidine hydrochloride.

PREFERRED EMBODIMENTS OF THE INVENTION

The hyperthermostable protease gene of the present invention can be obtained by screening a gene library of a hyperthermophile. As the hyperthermophile, bacteria belonging to the genus Pyrococcus can be used, and the gene of interest can be obtained by screening a cosmid library of the Pyrococcus furiosus genome.

For example, Pyrococcus furiosus DSM3638 can be used as Pyrococcus furiosus, and the strain is available from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH.

One example of a cosmid library of the Pyrococcus furiosus genome can be obtained by partially digesting the genomic DNA of Pyrococcus furiosus DSM3638 with the restriction enzyme Sau3AI (manufactured by Takara Shuzo Co., Ltd.), ligating the resulting DNA fragments with the triple helix cosmid vector (manufactured by Stratagene), and packaging the ligated product into lambda phage particles according to the in vitro packaging method. Then, the library is transduced into a suitable Escherichia coli, for example, Escherichia coli DH5αMCR (manufactured by BRL), to obtain transformants. The transformants are cultivated, and the microbial cells are collected, subjected to heat treatment (for example, 100 °C for 10 minutes), sonicated, and subjected to heat treatment (for example, 100 °C for 10 minutes) again. The presence or absence of protease activity in the resulting lysate can be screened by gelatin-containing SDS-polyacrylamide gel electrophoresis.

In this manner, a cosmid clone containing a hyperthermostable protease gene expressing a protease which is resistant to the above heat treatment can be obtained.

Further, the cosmid DNA prepared from the obtained cosmid clone can be digested into fragments with a suitable restriction enzyme to obtain recombinant plasmids each having a fragment incorporated. Then, a suitable microorganism is transformed with the plasmids, and the protease activity expressed by the resulting transformants can be examined to obtain a recombinant plasmid containing the hyperthermostable protease gene of interest.

That is, the cosmid prepared from one of the above cosmid clones is digested with NotI and PvuII (both manufactured by Takara Shuzo Co., Ltd.) to give an about 7.5 kb DNA fragment, which can be isolated and inserted between the NotI site and the SmaI site of the plasmid vector pUC19 (manufactured by Takara Shuzo Co., Ltd.) into which the NotI linker (manufactured by Takara Shuzo Co., Ltd.) has been introduced. The plasmid was designated plasmid pTPR12, and Escherichia coli JM109 transformed with the plasmid was designated Escherichia coli JM109/pTPR12 and has been deposited at the National Institute of Bioscience and Human-Technology at 1-1-3, Higashi, Tsukuba-shi, Ibaraki-ken, Japan since May 24, 1994 (original deposit date) under accession number FERM BP-5103 in accordance with the Budapest Treaty.

The lysate of Escherichia coli JM109/pTPR12 shows protease activity similar to that of the above cosmid clone on the gelatin-containing SDS-polyacrylamide gel.

The nucleotide sequence of the DNA fragment derived from Pyrococcus furiosus which was inserted into the plasmid pTPR12 can be determined by a conventional method, for example, the dideoxy method. The nucleotide sequence of the 4.8 kb portion flanked by two DraI sites within the DNA fragment inserted into the plasmid pTPR12 is shown in SEQ ID No. 7 of the Sequence Listing. The amino acid sequence of the gene product deduced from the nucleotide sequence is shown in SEQ ID No. 8 of the Sequence Listing. Thus, the hyperthermostable protease derived from Pyrococcus furiosus, the nucleotide sequence and the amino acid sequence of which were revealed, was designated protease PFUL. As shown in SEQ ID No. 8 of the Sequence Listing, the protease PFUL is a protease consisting of 1398 residues and having a high molecular weight of more than 150 thousand.
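
The stated molecular weight is consistent with a common rule-of-thumb estimate. The Python sketch below multiplies the residue count by an assumed average residue mass of about 110 Da; that average is a general approximation, not a value from this description, and the function name is hypothetical. An exact mass would have to be calculated from the sequence in SEQ ID No. 8.

```python
# Assumed average mass of one amino acid residue in a protein (rule of thumb, in Da).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approximate_mass_kda(n_residues, average_residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Estimate a protein's molecular mass in kDa from its residue count."""
    return n_residues * average_residue_mass / 1000.0


if __name__ == "__main__":
    # 1398 residues is the length stated for the protease PFUL.
    print(approximate_mass_kda(1398))
```

With these assumptions the estimate is roughly 154 kDa, in line with "more than 150 thousand".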

The protease PFUL gene can be expressed using Bacillus subtilis as a host. As Bacillus subtilis, Bacillus subtilis DB104 can be used, and the strain is the known one described in Gene, volume 83, page 215-233 (1989). As a cloning vector, the plasmid pUB18-P43 can be used; the plasmid was a gift from Dr. Sui-Lam Wong at Calgary University. The plasmid contains the kanamycin resistance gene as a selectable marker.

The plasmid pUBP13 is a plasmid in which an about 4.8 kb DNA fragment obtained by digestion of the plasmid pTPR12 with DraI has been inserted into the SmaI site of the plasmid vector pUB18-P43. In the plasmid, the protease PFUL gene is located downstream of the P43 promoter [J. Biol. Chem., volume 259, page 8619-8625 (1984)], which functions in Bacillus subtilis. Bacillus subtilis DB104 transformed with the plasmid was designated Bacillus subtilis DB104/pUBP13. The lysate of the transformant shows protease activity similar to that of Escherichia coli JM109/pTPR12.

However, only a trace amount of protease activity is detected in the culture supernatant of the transformant. This is thought to be because the molecular weight of the protease PFUL is extremely high and it is not translated effectively, and because the signal sequence encoded by the protease PFUL gene does not function in Bacillus subtilis. There is a possibility that the protease PFUL is a membrane-bound protease, and the peptide chain on the C-terminal side of the protease PFUL may be a region for binding to the cell membrane.

Fig. 1 shows a restriction map around the protease PFUL gene on the Pyrococcus furiosus chromosome, as well as the DNA fragment inserted into the plasmid pTPR12 and that inserted into the plasmid pUBP13. In addition, an arrow in Fig. 1 shows the open reading frame encoding the protease PFUL.
By comparing the amino acid sequence of the protease PFUL represented by SEQ ID No. 8 of the Sequence Listing with those of proteases derived from known microorganisms, it is seen that there is homology between the amino acid sequence of the front half portion of the protease PFUL and those of a group of alkaline serine proteases, a representative of which is subtilisin [Protein Engineering, volume 4, page 719-737 (1991)], and that there is high homology around the four amino acid residues which are considered to be important for the catalytic activity of the proteases.

As it was revealed that regions commonly present in the proteases derived from mesophiles are conserved in the amino acid sequence of the protease PFUL produced by the hyperthermophile Pyrococcus furiosus, it is expected that these regions are present in the same kind of proteases produced by hyperthermophiles other than Pyrococcus furiosus.

That is, a DNA having a suitable length can be prepared based on the sequence of a portion encoding the amino acid sequence of a region having high homology with that of subtilisin and the like, and the DNA can be used as a probe for hybridization or as a primer for gene amplification such as PCR to screen for a hyperthermostable protease gene similar to the present enzyme in various hyperthermophiles.

In the above method, a DNA fragment containing only a portion of the gene of interest is obtained in some cases. In that case, the nucleotide sequence of the resulting DNA fragment is investigated to confirm that it is a portion of the gene of interest and, thereafter, hybridization can be carried out using the DNA fragment or a part thereof as a probe, or PCR can be carried out using a primer synthesized based on the nucleotide sequence of the DNA fragment, to obtain the whole gene of interest.
The above hybridization can be carried out under the following conditions. That is, a membrane to which DNA is fixed is incubated with a suitably labeled probe at 50 °C for 12 to 20 hours in 6 x SSC (1 x SSC represents 0.15 M NaCl, 0.015 M sodium citrate, pH 7.0) containing 0.5% SDS, 0.1% bovine serum albumin (BSA), 0.1% polyvinylpyrrolidone, 0.1% Ficoll 400 and 0.01% denatured salmon sperm DNA. After the completion of incubation, the membrane is washed, beginning with washing at 37 °C in 2 x SSC containing 0.5% SDS and then varying the SSC concentration in a range down to 0.1 x and the temperature in a range up to 50 °C, until a signal from the probe hybridized to the fixed DNA can be discriminated from the background.

In addition, it is apparent to those skilled in the art that a probe and a primer can be made based on the new hyperthermostable protease gene thus obtained, in order to obtain another hyperthermostable protease gene in a similar manner.
Figs. 2, 3 and 4 show the relationship among the amino acid sequences of the regions in the amino acid sequence of the protease PFUL which have high homology with those of subtilisin and the like, the nucleotide sequences of the protease PFUL gene encoding the regions, and the nucleotide sequences of the oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R which were synthesized based thereon. Further, SEQ ID Nos. 9, 10, 11 and 12 of the Sequence Listing show the nucleotide sequences of the oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R. That is, SEQ ID Nos. 9-12 are the nucleotide sequences of one example of the oligonucleotides used for screening the hyperthermostable

mosomal DNA of the various n yP er * e !™^ Pyrococcus. genus Thsrmococcus . genus Staphy- 

As thehyperthermophi.es. ^ for 
lothermus. genus Thprmohactero.des and the UKe can oe ,ub*> obtained from Deutsche Sammlung von 

Z^VMOBSOa* celer DSM2476 ^^^^^^S^ DNA of Th^rmocgccus cglir 
Mikroorganismen und Zellkulturen QmbH K Wh ^ n n P 5^ 0 C .f a r "^ Heotides PRO-1 F and PRO-2R or a combination of the . 

= ot^^ 

TeTsS^ 

about SSObpDNAamplHiedusmgthed^o^ 

mid vector pUC18 (manufactured by Ta Kara Shuzo < Co ua+ '"^^ Usting snows the nucleotide sequence of the 
2RC2) and plasmid p2F-4R Zl^^Z™™**** therefrom and SEQ ID No. 14 
inserted DNA •W^J^K^ She inserteToNA fragment in the plasmid P 2F-4R and the 

of the Sequence Usting shows the n 4 ucle *' a f ^"rr No 13 of the sequence Listing, the nucleotide sequence from 
amino add sequence deduced therefrom J SEQ , D Na 14 of ^ 

the 1st to the 21st nucleotdes and ^^^^^^^LdUm and that from the 532nd to the 564th 
Sequence Usting. the nucleot.de sequence f ^t^i^onuJeoM^ed in PCR as primers (each conespond- 
nudeotidesarethenu^ 

ing to the oligonucleot.des PRO-1F. PRO 2R. ™£« nne p rotea ses derived from the various microorgan- 

the homology with that of the protease PFUL rlnr^ent^ bv SEQ ID Nos 1 3 and 14 of the Sequence Usting. indicat- 

nucleotides or the amplified DNA fragments obtain f R a J a . & ^ 

One example of the gene librar.es of Thprmococcus SglgL there k a jioraty ^prepa j H DNAfragment> | ig ating 
mosomalDNAofTJ***^ 

taining a gene of interest obtained can be digested with a suitable restriction 
Pr °ren 9 rphageDNApreparedfrom.epha X 

and BamHI (both manufactured by Takara Shuzo Co I td .) an 5tt> D BamHI site of the plasmid vec- 
45 theabout5kbDNAfragmentcanbeiso.atedand.nse^b 

tor pUC119 (manufactured 

- * a * hick - ,ine ~ s the DNA fra9ment 

can be removed according to the following PJ^^* to *e similar procedures described 

tured by Takara Shuzo Co.. Ltd.). southern hybnd.zat.on. seamed ort ■ «*w*nB» ™ 8 ^ 9 ^ DNA fragme nt 
above and it is found that an about 1 .9 kb DNA fragment hybr,d.zes to ^the ^J^^j^ Shuz0 Co.. Ltd.) 
can be isolate and inserted into the Sad site of the plasrrud ^^^^SS^ coli JM109 transformed 
55 to make a recombinant vector. The plas mid wa ^?nated ^ js ^ jn 

with the plasmid was designated Esdje^M eg* JM ^J^SLSStaSS pTasmid vector P UC1 1 8. By determining 
the nucleotide sequence of the DNA fragment inserted into the plasmid pTCS6, it can be confirmed that a protease gene is present in the DNA fragment. SEQ ID No. 15 of the Sequence Listing shows the nucleotide sequence of the fragment. By comparing the nucleotide sequence with that of the DNA fragment inserted into the plasmid p1F-2R(2) or the plasmid p2F-4R, represented by SEQ ID No. 13 or 14 of the Sequence Listing, it is seen that the DNA fragment inserted into the plasmid pTCS6 contains the DNA fragment which is also shared by the plasmid p2F-4R but lacks the 5' region of the protease gene.

Thus, the hyperthermostable protease gene derived from Thermococcus celer contained in the plasmid pTCS6 lacks a portion thereof. However, as is apparent to those skilled in the art, a DNA fragment covering the full-length hyperthermostable protease gene can be obtained by (1) screening the gene library once more, (2) conducting Southern hybridization using chromosomal DNA, or (3) obtaining a DNA fragment of the 5' upstream region by PCR using a cassette and a cassette primer (Takara Shuzo Co., Ltd., Genetic Engineering Products Guidance, 1994-1995 edition, page 250-251).

The present inventors selected method (3). That is, the chromosomal DNA of Thermococcus celer is completely digested with a few restriction enzymes, followed by ligation with the cassette (manufactured by Takara Shuzo Co., Ltd.) which corresponds to the restriction enzyme used. PCR is carried out using this ligation product as a template and the primer TCE6R (SEQ ID No. 16 of the Sequence Listing shows the nucleotide sequence of the primer TCE6R) and the cassette primer C1 (manufactured by Takara Shuzo Co., Ltd.) as primers. When the above procedures are carried out using the restriction enzyme HindIII (manufactured by Takara Shuzo Co., Ltd.), an about 1.8 kb DNA fragment is amplified, and a DNA fragment of about 1.5 kb obtained by digesting the amplified fragment with HindIII and SacI can be inserted between the HindIII site and the SacI site of the plasmid vector pUC119 to obtain a recombinant plasmid. The plasmid was designated plasmid pTC4, and Escherichia coli JM109 transformed with the plasmid was designated Escherichia coli JM109/pTC4. A restriction map of the plasmid pTC4 is shown in Fig. 8. In Fig. 8, a thick solid line designates the DNA fragment inserted into the plasmid vector pUC119.

By determining the nucleotide sequence of the DNA fragment inserted into the plasmid pTC4 by the dideoxy method, it can be confirmed that a protease gene is present in the DNA fragment. SEQ ID No. 17 of the Sequence Listing shows the nucleotide sequence of the fragment. By comparing the amino acid sequence deduced from the nucleotide sequence with those of various proteases, it is found that the DNA fragment inserted into the plasmid pTC4 covers the 5' region of the hyperthermostable protease gene which the plasmid pTCS6 lacks. By combining the nucleotide sequence with that of the DNA fragment inserted into the plasmid pTCS6, represented by SEQ ID No. 15 of the Sequence Listing, the nucleotide sequence of the full-length hyperthermostable protease gene derived from Thermococcus celer can be identified. The nucleotide sequence of the open reading frame present in the obtained nucleotide sequence is shown in SEQ ID No. 2 of the Sequence Listing, and the amino acid sequence deduced from the nucleotide sequence is shown in SEQ ID No. 1. Thus, the hyperthermostable protease derived from Thermococcus celer, the nucleotide sequence encoding it and the amino acid sequence of which were revealed, was designated protease TCES. The full length of the protease TCES gene can be reconstituted by combining the inserted DNA fragment of the plasmid pTC4 and that of the plasmid pTCS6.
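
Combining the two sequenced fragments into one contiguous sequence can be pictured with a minimal Python sketch. It joins an upstream and a downstream sequence on the longest stretch shared by the end of the first and the start of the second. The function name, the minimum overlap of 6 nucleotides and the toy sequences are assumptions made for illustration; they do not come from SEQ ID Nos. 15 or 17.

```python
def merge_overlapping(upstream, downstream, min_overlap=6):
    """Join two sequences on the longest suffix of `upstream` that is a prefix of `downstream`."""
    longest_possible = min(len(upstream), len(downstream))
    for k in range(longest_possible, min_overlap - 1, -1):
        if upstream.endswith(downstream[:k]):
            # Keep the shared stretch once and append the rest of the downstream part.
            return upstream + downstream[k:]
    raise ValueError("no overlap of the required length found")


if __name__ == "__main__":
    # Toy fragments sharing the stretch GAGCTCAA (invented for illustration).
    five_prime_part = "ATGGCTGAGCTCAA"
    three_prime_part = "GAGCTCAAGGTTTAA"
    print(merge_overlapping(five_prime_part, three_prime_part))
```

A real assembly would also have to consider sequencing errors and fragment orientation, which this sketch ignores.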

It is contemplated that the protease activity expressed by the gene can be confirmed by reconstituting the full-length protease TCES gene from the two DNA fragments contained in pTC4 and pTCS6 and inserting it downstream of the lac promoter of a plasmid to give an expression plasmid, which is introduced into Escherichia coli. However, this method affords no transformants into which the expression vector of interest has been introduced, and it is presumed that production of the product expressed from the gene is harmful or lethal to Escherichia coli. In such a case, it is contemplated that, for example, the protease is secreted extracellularly using Bacillus subtilis as a host to confirm the activity.

As a host for expressing the protease TCES gene in Bacillus subtilis, Bacillus subtilis DB104 can be used and, as a cloning vector, the plasmid pUB18-P43 can be used.

However, since the host-vector system for Escherichia coli has the advantages that various kinds of vectors are available and that transformation can be carried out simply and highly efficiently, as many as possible of the procedures for constructing an expression vector are desirably carried out using Escherichia coli. That is, in Escherichia coli, an optional nucleotide sequence containing a termination codon is inserted between the two protease gene fragments derived from the plasmid pTC4 and the plasmid pTCS6 so that the full-length protease TCES gene is not reconstituted, thus making expression of the gene product impossible, and therefore the construction of a plasmid can be carried out. Then, this inserted sequence can be removed at the final stage to make the expression plasmid pSTC3 of interest shown in Fig. 10.

The procedures for constructing the plasmid pSTC3 shown in Fig. 9 are explained below. 

First, the about 1.8 kb HindIII-SspI fragment inserted into the plasmid pTCS6 is inserted between the HindIII site and the EcoRV site of the plasmid vector pBR322 (manufactured by Takara Shuzo Co., Ltd.) to make the recombinant plasmid pBTC5 and, from this plasmid, the DNA fragment between the HindIII site and the KpnI site derived from the multicloning site of the plasmid vector pUC118 and the BamHI site present on the plasmid vector pBR322 are removed to make the plasmid pBTC5HKB.

Then, based on the nucleotide sequence of the protease TCES gene, the primer TCE12, which can introduce an EcoRI site and a BamHI site in front of the initiation codon of the protease TCES, and the primer TCE20R, which can introduce a ClaI site and a termination codon on the 3' side of the only SacI site present in the nucleotide sequence, are synthesized. SEQ ID Nos. 18 and 19 of the Sequence Listing show the nucleotide sequences of the primer TCE12 and the primer TCE20R, respectively.

An about 0.9 kb DNA fragment which has been amplified by PCR using the chromosomal DNA of Thermococcus celer as a template and using these two primers is digested with EcoRI and ClaI (manufactured by Takara Shuzo Co., Ltd.), and inserted between the EcoRI site and the ClaI site of the plasmid pBTC5HKB to obtain the plasmid pBTC6, which has a mutant gene in which a nucleotide sequence of 69 bp including a termination codon is inserted into the SacI site of the protease TCES gene.

A ribosome binding site derived from the Bacillus subtilis P43 promoter [J. Biol. Chem., volume 259, page 8619-8625 (1984)] is introduced between the KpnI site and the BamHI site of the plasmid vector pUC18 to make the plasmid pUC-P43. The nucleotide sequences of the synthetic oligonucleotides BS1 and BS2 are shown in SEQ ID Nos. 20 and 21 of the Sequence Listing, respectively. Then, the plasmid pBTC6 is digested with BamHI and SphI (both manufactured by Takara Shuzo Co., Ltd.) to obtain an about 3 kb DNA fragment containing the mutant gene of the protease TCES, which is inserted between the BamHI site and the SphI site of the plasmid pUC-P43 to construct the plasmid pTC12.

All the above procedures for constructing a plasmid can be carried out using Escherichia as a host.

The SacI site present in the plasmid vector pUB18-P43 used for cloning into Bacillus subtilis is removed in advance, and an about 3 kb KpnI-SphI DNA fragment obtained from the plasmid pTC12 can be inserted between the KpnI site and the SphI site to make the plasmid pSTC2 using Bacillus subtilis DB104 as a host. The plasmid contains a mutant gene of the protease TCES having the P43 promoter and a ribosome binding site sequence on its 5' side. The plasmid pSTC2 is digested with SacI, and intramolecular ligation is carried out to obtain a recombinant plasmid from which the inserted sequence contained in the SacI site of the above mutant gene has been removed. The recombinant plasmid was designated plasmid pSTC3, and Bacillus subtilis DB104 transformed with the plasmid was designated Bacillus subtilis DB104/pSTC3 and has been deposited at the National Institute of Bioscience and Human-Technology at 1-1-3, Higashi, Tsukuba-shi, Ibaraki-ken, Japan under accession number FERM BP-5635 since December 1, 1995 (original deposit date) according to the Budapest Treaty. The transformant is cultured, and a culture supernatant and an extract from the cells are investigated for the protease activity. As a result, the thermostable protease activity is found in both samples.
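
The removal of the inserted sequence by SacI digestion followed by intramolecular ligation can be pictured with a short sketch. It is only an illustration in Python, assuming a linear toy sequence in which the insert lies between two SacI sites (recognition sequence GAGCTC, cut as GAGCT^C). The toy sequence and the function name excise_saci_insert are hypothetical and are not taken from the Sequence Listing; the real insert described above is 69 bp long.

```python
SACI_SITE = "GAGCTC"   # SacI recognition sequence
SACI_CUT_OFFSET = 5    # SacI cuts after the fifth base: GAGCT^C


def excise_saci_insert(seq: str) -> str:
    """Drop the segment between the first two SacI cut points and rejoin the ends.

    Models digestion with SacI followed by intramolecular ligation of the
    large fragment, which regenerates a single SacI site.
    """
    first = seq.find(SACI_SITE)
    second = seq.find(SACI_SITE, first + 1) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("sequence must contain two SacI sites")
    cut1 = first + SACI_CUT_OFFSET
    cut2 = second + SACI_CUT_OFFSET
    return seq[:cut1] + seq[cut2:]


# Hypothetical toy construct: a stop codon (TAA) sits between two SacI sites.
toy_mutant = "ATGGAGCTCTAAGGGAGCTCAAA"
restored = excise_saci_insert(toy_mutant)
print(restored)                   # ATGGAGCTCAAA
print(restored.count(SACI_SITE))  # 1
```

In the toy case the stop codon disappears together with the excised segment, which mirrors why the interrupted gene becomes expressible only after this final step.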

Fig. 10 shows a restriction map of the plasmid pSTC3. In Fig. 10, a thick solid line designates the DNA fragment inserted into the plasmid vector pUB18-P43.

When the amino acid sequences of the protease PFUL, the protease TCES and subtilisin are aligned so that the regions having the homology coincide with each other as shown in Figs. 11 and 12, it is seen that the protease PFUL has regions which have no homology with the sequence of the protease TCES and that of subtilisin at the C-terminal side thereof as well as between the regions having the homology. From this, it is contemplated that, besides the protease PFUL, a protease having a smaller molecular weight than that of the protease PFUL, such as the protease TCES or subtilisin, may be present in Pyrococcus furiosus. In order to search for a gene encoding such a protease, Southern hybridization can be carried out using a chromosomal DNA prepared from Pyrococcus furiosus as a target, and using as a probe a DNA fragment containing a nucleotide sequence within the protease TCES gene which encodes an amino acid sequence well conserved in the three proteases, for example, the about 150 bp DNA fragment inserted into the plasmid p1F-2R(2). Since the DNA fragment used as a probe also has homology with the protease PFUL gene, a fragment of that gene may be detected as a signal depending upon the hybridization conditions. However, the position of the signal derived from the protease PFUL gene can be estimated in advance for each restriction enzyme used for cutting the chromosomal DNA, from the information on the nucleotide sequence of the protease PFUL gene and its restriction map. When some enzymes are used, in addition to the signal at the position predicted for the protease PFUL gene, another signal is detected at almost the same level, suggesting the possibility that at least one protease is present in Pyrococcus furiosus in addition to the protease PFUL.

For isolating a gene corresponding to the above new signal, a portion of the gene is first cloned to prevent the failure of isolation of the gene which, as in the case of the protease TCES, results from the expression of a gene product which is harmful or lethal to Escherichia coli. For example, when a chromosomal DNA of Pyrococcus furiosus is digested with the restriction enzymes SacI and SpeI (both manufactured by Takara Shuzo Co., Ltd.) and the digestion products are used to conduct Southern hybridization as described above using a fragment of the protease TCES gene as a probe, a new signal corresponding to about 0.6 kb, derived from the new gene, is observed in place of the signal corresponding to about 3 kb which is observed in the case of digestion only with SacI. This about 0.6 kb SpeI-SacI fragment encodes an amino acid sequence of at most around 200 residues and is not expected to express a protease having the activity. A Pyrococcus furiosus chromosomal DNA digested with SacI and SpeI is subjected to agarose gel electrophoresis to recover a DNA fragment corresponding to about 0.6 kb from the gel.
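
The figure of around 200 residues follows from the triplet code: one amino acid is encoded by three nucleotides, so a fragment of about 0.6 kb can encode at most about 600 / 3 residues. A minimal sketch of that arithmetic, with a hypothetical helper name, is:

```python
def max_encoded_residues(fragment_bp: int) -> int:
    """Upper bound on the number of amino acid residues a fragment can encode.

    Each residue needs one codon of three nucleotides; any incomplete
    trailing codon is ignored.
    """
    return fragment_bp // 3


print(max_encoded_residues(600))  # 200
```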

Then, the fragment is inserted between the SpeI site and the SacI site of the plasmid vector pBluescript SK(-) (manufactured by Stratagene) and the resulting recombinant plasmid is used to transform Escherichia coli JM109. From this transformant, a clone with a fragment of interest incorporated can be obtained by colony hybridization using the same probe as that used for the above Southern hybridization. Whether a plasmid contained in the resulting clone has the sequence encoding a protease or not can be confirmed by conducting PCR using the primers 1FP1, 1FP2, 2RP1 and 2RP2 (the nucleotide sequences of the primers 1FP1, 1FP2, 2RP1 and 2RP2 are shown in SEQ ID Nos. 22, 23, 24, and 25 of the Sequence Listing) made based on the amino acid sequence common to the above various proteases, or by determining the nucleotide sequence of a DNA fragment inserted into the plasmid prepared from the clone. The plasmid in which the existence of a protease gene is confirmed in this manner was designated the plasmid pSS3. The nucleotide sequence of a DNA fragment inserted in the plasmid, and the amino acid sequence deduced therefrom are shown in SEQ ID No. 26 of the Sequence Listing.

The amino acid sequence deduced from the nucleotide sequence of the DNA fragment inserted into the plasmid pSS3 is shown to have the homology with the sequences of subtilisin, the protease PFUL, the protease TCES and the like. A product of a protease gene different from the protease PFUL, a portion of which was obtained newly from Pyrococcus furiosus, was designated protease PFUS. A region encoding an N-terminal side part of the protease, that is, a region 5' of the SpeI site, and a region encoding a C-terminal side part, that is, a gene fragment 3' of the above SacI site, can be obtained by the inverse PCR method. If the restriction enzyme sites in the protease PFUS gene and the vicinity thereof in a chromosome are revealed in advance, the inverse PCR can be carried out using an appropriate restriction enzyme. The restriction enzyme sites can be revealed by cutting a chromosomal DNA of Pyrococcus furiosus with various restriction enzymes, and conducting Southern hybridization using a DNA fragment inserted into the plasmid pSS3 as a probe. Consequently, it is shown that a SacI site is located about 3 kb to the 5' side of the SpeI site and an XbaI site is located about 5 kb to the 3' side of the SacI site.

A primer used for the inverse PCR can be designed to anneal at around an end of the SpeI-SacI fragment contained in the plasmid pSS3. The primers designed to anneal at around the SacI site are designated NPF-1 and NPF-2 and a primer designed to anneal at around the SpeI site is designated NPR-3. The nucleotide sequences thereof are shown in SEQ ID Nos. 27, 28 and 29 of the Sequence Listing, respectively.

A chromosomal DNA of Pyrococcus furiosus is digested with SacI or XbaI (both manufactured by Takara Shuzo Co., Ltd.), respectively, and allowed to ligate intramolecularly, and this reaction mixture can be used as a template for the inverse PCR. When a chromosomal DNA is digested with SacI, an about 3 kb fragment is amplified by the inverse PCR, which is inserted into the plasmid vector pT7BlueT (manufactured by Novagen) to obtain a recombinant plasmid which was designated plasmid pS322. On the other hand, in the case of a chromosomal DNA digested with XbaI, an about 9 kb fragment is amplified. The amplified fragment is digested with XbaI to obtain an about 5 kb fragment which is recovered and inserted into the plasmid vector pBluescript SK(-) to obtain a recombinant plasmid, which was designated plasmid pSKX5. By combining the results of Southern hybridization performed using the SpeI-SacI fragment contained in the plasmid pSS3 as a probe, and those of analysis on the plasmids pS322 and pSKX5 with the restriction enzymes, a restriction map of the protease PFUS gene and the vicinity thereof in a chromosome can be obtained. The restriction map is shown in Fig. 13.

In addition, by analyzing the nucleotide sequence of the 5' fragment inserted into the plasmid pS322 in the 5' direction starting from the SpeI site, the amino acid sequence of an enzyme protein encoded by the region can be deduced. The resulting nucleotide sequence and the amino acid sequence deduced therefrom are shown in SEQ ID No. 30 of the Sequence Listing. Since the amino acid sequence of this region has the homology with that of a protease such as subtilisin or the like, an initiation codon of the protease PFUS can be presumed based on this homology and, thus, the primer NPF-4 which can introduce the BamHI site in front of the initiation codon of the protease PFUS can be designed. On the other hand, the nucleotide sequence determined by analyzing the 3' fragment of the protease PFUS gene inserted into the plasmid pSKX5 in the 5' direction starting from the XbaI site is shown in SEQ ID No. 31 of the Sequence Listing. Based on the nucleotide sequence, the primer NPR-4 which can insert the SphI site into the vicinity of the XbaI site can be designed. The nucleotide sequences of the primers NPF-4 and NPR-4 are shown in SEQ ID Nos. 32 and 33 of the Sequence Listing, respectively. The full length protease PFUS gene can be amplified by using these two primers and using a chromosomal DNA of Pyrococcus furiosus as a template.

The protease PFUS can be expressed in the Bacillus subtilis system, as in the case of the protease TCES. A plasmid for expressing the protease PFUS can be constructed based on the expression plasmid pSTC3 for the protease TCES. First, a DNA fragment containing the full length protease PFUS gene, which can be amplified by the PCR, is digested with BamHI and SacI to recover an about 0.8 kb fragment encoding an N-terminal part of the enzyme. This fragment is substituted for the BamHI-SacI fragment of the plasmid pSTC3, which also encodes an N-terminal part of the protease TCES. The resulting expression plasmid encoding a hybrid protein of the protease TCES and the protease PFUS was designated the plasmid pSPT1. Fig. 14 shows a restriction map of the plasmid pSPT1.

Then, the above PCR-amplified DNA fragment is digested with SpeI and SphI to give an about 5.7 kb fragment which is isolated and substituted for the SpeI-SphI fragment encoding a C-terminal part of the protease TCES in the plasmid pSPT1. The expression plasmid thus constructed was designated plasmid pSNP1, and Bacillus subtilis DB104 transformed with the plasmid was designated Bacillus subtilis DB104/pSNP1 and has been deposited at the National Institute of Bioscience and Human-Technology (NIBH) at 1-1-3, Higashi, Tsukuba-shi, Ibaraki-ken, Japan since December 1, 1995 (original deposit date) under the accession number FERM BP-5634 under the Budapest Treaty. Fig. 15 shows a restriction map of the plasmid pSNP1.

The Bacillus subtilis DB104/pSNP1 is cultured and a culture supernatant and an extract from the cells are examined for the protease activity, and the thermostable protease activity is found in both samples.

The nucleotide sequence of a gene encoding the protease PFUS can be determined by digesting the DNA fragment inserted into the plasmid pSNP1 with a restriction enzyme into appropriately sized fragments, subcloning the fragments into an appropriate cloning vector, and conducting the dideoxy method using the subcloned fragments as a template. SEQ ID No. 34 of the Sequence Listing shows the nucleotide sequence of the open reading frame present in the nucleotide sequence thus obtained. In addition, SEQ ID No. 35 of the Sequence Listing shows the amino acid sequence of the protease PFUS deduced from the nucleotide sequence.

When Bacillus subtilis DB104 transformed with the plasmid pSPT1 is cultured, the protease activity is found in both a culture supernatant and an extract from the cells. SEQ ID No. 6 of the Sequence Listing shows the nucleotide sequence of the open reading frame encoding a hybrid protein of the protease TCES and the protease PFUS. In addition, SEQ ID No. 5 of the Sequence Listing shows the amino acid sequence of the hybrid protein deduced from the nucleotide sequence.

An amount of an expressed protease of the present invention can be increased by utilizing a gene which is highly expressed in Bacillus subtilis, particularly a secretory protein gene. As such a gene, the genes of α-amylase and various extracellular proteases can be used. For example, an amount of the expressed protease PFUS can be increased by utilizing the promoter and the signal sequence of subtilisin. That is, by ligating the full length protease PFUS gene downstream of a region encoding the signal sequence of the subtilisin gene so that the translational frames of both genes coincide with each other, the protease PFUS can be expressed as a fusion protein under the control of the subtilisin gene promoter.

As the promoter and the signal sequence of subtilisin, those of the subtilisin gene which are inserted into the plasmid pKWZ described in J. Bacteriol., volume 171, page 2657-2665 (1989) can be used. The nucleotide sequence of the gene is described in the above literature for a 5' upstream region containing the promoter sequence and in J. Bacteriol. (1984) for a region encoding subtilisin, respectively. Based on these sequences, the primer SUB4 for introducing the EcoRI site upstream of the promoter sequence of the gene, and the primer BmR1 for introducing the BamHI site behind a region encoding the signal sequence of subtilisin are synthesized. SEQ ID Nos. 36 and 37 of the Sequence Listing show the nucleotide sequences of the primers SUB4 and BmR1, respectively. By using the primers SUB4 and BmR1, an about 0.3 kb DNA fragment containing the region from the promoter to the signal sequence of the subtilisin gene can be amplified by PCR using the plasmid pKWZ as a template.
The protease PFUS gene to be ligated downstream of the DNA fragment can be taken from a chromosomal DNA of Pyrococcus furiosus by the PCR method. As a primer which hybridizes with a 5' part of the gene, the primer NPF-4 can be used. In addition, a primer which hybridizes with a 3' part can be made after the nucleotide sequence downstream of a termination codon of the gene is determined. That is, a portion of the nucleotide sequence of the plasmid pSNPD, obtained by subcloning an about 0.6 kb fragment produced by digestion of the plasmid pSNP1 with BamHI into the BamHI site of the plasmid vector pUC119, is determined (the nucleotide sequence is shown in SEQ ID No. 38 of the Sequence Listing). Based on the sequence, the primer NPM-1 which hybridizes with a 3' part of the protease PFUS gene and can introduce the SphI site is synthesized. SEQ ID No. 39 of the Sequence Listing shows the sequence of the primer NPM-1.

On the other hand, when the protease PFUS gene is ligated to the above 0.3 kb DNA fragment by utilizing the BamHI site, the only BamHI site present in the gene becomes a barrier to the procedures. The primers mutRR and mutFR for removing this BamHI site by the PCR-mutagenesis method can be made based on the nucleotide sequence of the protease PFUS gene shown in SEQ ID No. 34 of the Sequence Listing. The nucleotide sequences of the primers mutRR and mutFR are shown in SEQ ID Nos. 40 and 41, respectively. In addition, when the BamHI site is removed by utilizing these primers, glycine present at the position 560 in the amino acid sequence of the protease PFUS shown in SEQ ID No. 35 of the Sequence Listing is substituted with valine due to the nucleotide substitution which is introduced into the site.

By using these primers, the protease PFUS gene to be ligated to the promoter to signal sequence-coding region of the subtilisin gene can be obtained. That is, two kinds of PCRs are carried out using a chromosomal DNA of Pyrococcus furiosus as a template and using two pairs of primers, the primers mutRR and NPF-4, and the primers mutFR and NPM-1. Further, the second PCR is carried out using a hetero duplex formed by mixing the DNA fragments amplified by the respective PCRs as a template, and using the primers NPF-4 and NPM-1. Thus, the full length of the about 2.4 kb protease PFUS gene containing no BamHI site can be amplified.
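
How a single nucleotide substitution can both destroy a BamHI site and replace glycine with valine can be shown on a toy sequence. The sketch below is an assumption-laden illustration only: the 12-base sequence, the reading frame, the position of the change and the helper names are all hypothetical, and the actual substitution used in the primers mutRR and mutFR is the one given in the Sequence Listing, not the one shown here.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Only the codons needed for this toy example (standard genetic code).
CODON_TABLE = {"ATG": "Met", "GGA": "Gly", "GTA": "Val", "TCC": "Ser", "AAA": "Lys"}


def translate(seq: str) -> list[str]:
    """Translate an in-frame DNA string using the small table above."""
    return [CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3)]


def substitute(seq: str, position: int, base: str) -> str:
    """Return seq with the base at a 0-based position replaced."""
    return seq[:position] + base + seq[position + 1:]


# Hypothetical wild-type stretch: ATG GGA TCC AAA, containing GGATCC.
wild_type = "ATGGGATCCAAA"
mutant = substitute(wild_type, 4, "T")  # GGA (Gly) becomes GTA (Val)

print(translate(wild_type), BAMHI_SITE in wild_type)
print(translate(mutant), BAMHI_SITE in mutant)
```

In this toy frame the change of GGA to GTA removes the recognition sequence and alters exactly one residue, which is the kind of trade-off the paragraph above reports for position 560.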

An about 2.4 kb DNA fragment obtained by digesting the above PCR-amplified DNA fragment with BamHI and SphI is isolated, and substituted for the BamHI-SphI fragment containing the protease PFUS gene in the plasmid pSNP1. The expression plasmid thus constructed was designated pPS1 and Bacillus subtilis DB104 transformed with the plasmid was designated Bacillus subtilis DB104/pPS1. When the transformant is cultured, protease activity similar to that in the case of the use of the plasmid pSNP1 is found in both a culture supernatant and an extract from the cells, and it is confirmed that the substitution of the amino acid does not affect the enzyme activity. Fig. 16 shows a restriction map of the plasmid pPS1.

The about 0.3 kb DNA fragment containing the region from the promoter to the signal sequence of the subtilisin gene is digested with EcoRI and BamHI and substituted for the EcoRI-BamHI fragment containing the P43 promoter and the ribosome binding site in the plasmid pPS1. The expression plasmid thus constructed was designated pNAPS1 and Bacillus subtilis transformed with the plasmid was designated Bacillus subtilis DB104/pNAPS1. The transformant is cultured and a culture supernatant and an extract from the cells are examined for the protease activity to find that the thermostable protease activity is recognized in both samples. An amount of expressed enzyme is increased as compared with Bacillus subtilis DB104/pSNP1. Fig. 17 shows a restriction map of the plasmid pNAPS1.

By a method similar to that in the case of the protease TCES gene and the protease PFUS gene, a protease gene having the homology with these genes can be obtained from hyperthermophiles other than Pyrococcus furiosus and Thermococcus celer. However, in PCR using the above oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R as primers and using a chromosomal DNA of Staphylothermus marinus DSM3639 and that of Thermobacteroides proteoliticus DSM5265 as a template, the amplification of a DNA fragment as found in Thermococcus celer was not found. In addition, it is known that the efficiency of gene amplification by PCR is largely influenced by the efficiency of annealing of a 3' terminal part of a primer and a template DNA. Even when the amplification of a DNA by PCR is not observed, a protease gene can be detected by synthesizing and using oligonucleotides having a nucleotide sequence different from that used this time but encoding the same amino acid sequence. Alternatively, a protease gene can also be detected by conducting Southern hybridization using a chromosomal DNA and using the above oligonucleotides or a portion of other hyperthermostable protease genes as a probe.

An about 1 kb DNA fragment encoding the sequence of residue 323 to residue 650 of the amino acid sequence of the protease PFUL represented by SEQ ID No. 8 of the Sequence Listing is prepared, and this can be used as a probe to conduct genomic Southern hybridization using a chromosomal DNA of Staphylothermus marinus DSM3639 and that of Thermobacteroides proteoliticus DSM5265. As a result, when the Staphylothermus marinus chromosomal DNA digested with PstI (manufactured by Takara Shuzo Co., Ltd.) is used, a signal is observed at a position of about ... kb. On the other hand, when the Thermobacteroides proteoliticus chromosomal DNA digested with XbaI is used, a signal is observed at a position of about ... kb.

From this, it is shown that a sequence having the homology with the protease PFUL, the protease PFUS and the protease TCES genes is present also in the Staphylothermus marinus and Thermobacteroides proteoliticus chromosomes. From the DNA fragment thus detected, a gene encoding a hyperthermostable protease present in Staphylothermus marinus or Thermobacteroides proteoliticus can be isolated and identified by using the same method as that used when the gene encoding the protease TCES or the protease PFUS is isolated and identified.

The transformant in which the protease TCES gene, a hyperthermostable protease gene of the present invention, is introduced (Bacillus subtilis DB104/pSTC3) expresses a hyperthermostable protease in a culture by culturing at 37 °C in LB medium containing 10 µg/ml kanamycin. After the completion of cultivation, a crude enzyme preparation is obtained by subjecting the culture to centrifugation to collect a supernatant, followed by salting out with ammonium sulfate and dialysis. The crude enzyme preparation thus obtained from Bacillus subtilis DB104/pSTC3 was designated TC-3.

According to similar procedures, a crude enzyme preparation can be obtained from the transformant Bacillus subtilis DB104/pSNP1 in which the protease PFUS gene is introduced, or the transformant Bacillus subtilis DB104/pSPT1 in which a gene encoding a hybrid protease of the protease TCES and the protease PFUS is introduced. Crude enzyme preparations obtained from Bacillus subtilis DB104/pSNP1 and Bacillus subtilis DB104/pSPT1 were designated NP-1 and PT-1, respectively.

The transformant Bacillus subtilis DB104/pNAPS1 in which the protease PFUS gene, a hyperthermostable protease gene of the present invention, is introduced expresses a hyperthermostable protease in the cells or culture under conventional conditions, for example, by culturing at 37 °C in LB medium containing 10 µg/ml kanamycin. After completion of the cultivation, the cells and the culture supernatant are separated by centrifugation, from either of which a crude enzyme preparation of the protease PFUS can be obtained by the following procedures.

When an enzyme is purified from the cells, the cells are first lysed by the lysozyme treatment, and the lysate is heat-treated and centrifuged to recover a supernatant. This supernatant can be fractionated with ammonium sulfate and subjected to hydrophobic chromatography to obtain a purified enzyme. The purified enzyme preparation thus obtained from Bacillus subtilis DB104/pNAPS1 was designated NAPS-1.

On the other hand, the culture supernatant is dialyzed and subjected to anion-exchange chromatography. The eluted active fractions can be collected, heat-treated, fractionated with ammonium sulfate, and subjected to hydrophobic chromatography to obtain a purified enzyme of the protease PFUS. The purified enzyme preparation was designated NAPS-1S.

When the purified products NAPS-1 and NAPS-1S thus obtained are subjected to SDS-polyacrylamide gel electrophoresis, both enzyme preparations show a single band corresponding to a molecular weight of about 45 kDa. These two enzyme preparations are substantially the same enzyme preparation, which has been converted into a mature (active-type) enzyme by removal of a pro-sequence by heat-treatment during the purification procedures.

The protease preparations produced by the transformants in which a hyperthermostable protease gene obtained by the present invention is introduced, for example, TC-3, NP-1, PT-1, NAPS-1 and NAPS-1S, have the following enzymatic and physicochemical properties.

(1) Activity 

The enzymes obtained in the present invention hydrolyze gelatin to produce the short-chain polypeptides. In addi- 
tion, the enzymes hydrolyze casein to produce short-chain polypeptides. 

In addition, the enzymes obtained in the present invention hydrolyze succinyl-L-leucyl-L-leucyl-L-valyl-L-tyrosine- 
4-methylcoumarin-7-amide (Suc-Leu-Leu-Val-Tyr-MCA) to produce a fluorescent material (7-amino-4-methylcou- 
marin). 

Further, the enzymes obtained in the present invention hydrolyze succinyl-L-alanyl-L-alanyl-L-prolyl-L-phenylalanine-p-nitroanilide (Suc-Ala-Ala-Pro-Phe-p-NA) to produce a yellow material (p-nitroaniline).

(2) Method for measuring enzyme activity 

The enzyme activity of the enzyme preparations obtained in the present invention can be measured using a syn- 
thetic peptide substrate. 

The enzyme activity of the enzyme preparation TC-3 obtained in the present invention can be measured using as a substrate Suc-Leu-Leu-Val-Tyr-MCA (manufactured by Peptide Laboratory). That is, the enzyme preparation to be detected for the enzyme activity is appropriately diluted, and to 20 µl of the solution is added 80 µl of a 0.1 M sodium phosphate buffer (pH 7.0) containing 62.5 µM Suc-Leu-Leu-Val-Tyr-MCA, followed by incubating at 75 °C for 30 minutes. After the reaction is stopped by the addition of 20 µl of 30% acetic acid, the fluorescent intensity is measured at the excitation wavelength of 355 nm and the fluorescence wavelength of 460 nm to quantitate an amount of the generated 7-amino-4-methylcoumarin, and the resulting value is compared with that obtained when incubating without the addition of the enzyme preparation, to investigate the enzyme activity. The enzyme preparation TC-3 obtained by the present invention had the Suc-Leu-Leu-Val-Tyr-MCA hydrolyzing activity measured at pH 7.0 and 75 °C.
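
The volumes given above fix the substrate concentration during the reaction: 80 µl of a 62.5 µM substrate solution in a total of 100 µl gives 50 µM. The following sketch works through that dilution and the comparison with the enzyme-free control. The function names and the fluorescence readings are hypothetical, and the subtraction is only one simple way of expressing the comparison described in the text.

```python
def final_concentration(stock_conc: float, stock_volume: float, total_volume: float) -> float:
    """Concentration after dilution; the result has the same unit as stock_conc."""
    return stock_conc * stock_volume / total_volume


def net_signal(sample_reading: float, blank_reading: float) -> float:
    """Reading of the enzyme reaction minus the reading of the enzyme-free control."""
    return sample_reading - blank_reading


# 20 µl enzyme solution + 80 µl of 62.5 µM Suc-Leu-Leu-Val-Tyr-MCA
substrate_uM = final_concentration(62.5, 80.0, 20.0 + 80.0)
print(substrate_uM)  # 50.0

# Hypothetical fluorescence readings (arbitrary units)
print(net_signal(sample_reading=840.0, blank_reading=40.0))  # 800.0
```

The same dilution helper applies to the p-nitroanilide assay, where equal volumes of enzyme and substrate solution halve the substrate concentration.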

In addition, the enzyme activity of the enzyme preparations NP-1, PT-1, NAPS-1 and NAPS-1S can be photometrically measured using Suc-Ala-Ala-Pro-Phe-p-NA (manufactured by Sigma) as a substrate. That is, an enzyme preparation to be detected for the enzyme activity was appropriately diluted, and to 50 µl of the solution was added 50 µl of a 0.1 M potassium phosphate buffer (pH 7.0) containing Suc-Ala-Ala-Pro-Phe-p-NA (Suc-Ala-Ala-Pro-Phe-p-NA solution), followed by incubating at 95 °C for 30 minutes. After the reaction was stopped by ice-cooling, the absorbance at 405 nm was measured to quantitate an amount of the generated p-nitroaniline, and the resulting value was compared with that obtained when incubating without the addition of the enzyme preparation, to investigate the enzyme activity. For this, a 0.2 mM solution of Suc-Ala-Ala-Pro-Phe-p-NA was used for the enzyme preparations NP-1 and PT-1 and a 1 mM solution was used for the enzyme preparations NAPS-1 and NAPS-1S. The enzyme preparations NP-1, PT-1, NAPS-1 and NAPS-1S obtained by the present invention have the Suc-Ala-Ala-Pro-Phe-p-NA hydrolyzing activity measured at pH 7.0 and 95 °C.
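
Quantitating p-nitroaniline from the absorbance at 405 nm is conventionally done with the Beer-Lambert law, A = ε · c · l, applied to the difference between the reaction and the enzyme-free control. The sketch below assumes this approach; the text does not state a molar absorption coefficient, so ε is left as a caller-supplied parameter, and the value used in the example is a placeholder rather than a measured constant.

```python
def p_nitroaniline_molar(a405_sample: float, a405_blank: float,
                         epsilon_per_M_cm: float, path_cm: float = 1.0) -> float:
    """Estimate the p-nitroaniline concentration (mol/l) released by the enzyme.

    Uses the Beer-Lambert law on the blank-corrected absorbance. The molar
    absorption coefficient must be supplied by the user for their own
    instrument and buffer; it is not given in the description.
    """
    delta_a = a405_sample - a405_blank
    return delta_a / (epsilon_per_M_cm * path_cm)


# Placeholder numbers for illustration only.
conc = p_nitroaniline_molar(a405_sample=0.6, a405_blank=0.1, epsilon_per_M_cm=10000.0)
print(f"{conc * 1e6:.1f} µM")
```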

(3) Detection of activity on various substrates 

The activity of the enzyme preparations obtained in the present invention on the synthetic peptide substrates is confirmed by the method for measuring the enzyme activity described in the above (2). That is, the enzyme preparation TC-3 obtained in the present invention has the Suc-Leu-Leu-Val-Tyr-MCA hydrolyzing activity, and the enzyme preparations NP-1, PT-1, NAPS-1 and NAPS-1S have the Suc-Ala-Ala-Pro-Phe-p-NA hydrolyzing activity. In addition, the enzyme preparations NP-1, PT-1, NAPS-1 and NAPS-1S were investigated for the Suc-Leu-Leu-Val-Tyr-MCA hydrolyzing activity by the enzyme activity measuring method described in the above (2) used for the enzyme preparation TC-3, and it was shown that these enzyme preparations had the activity to degrade the substrate. Further, the enzyme preparation TC-3 was investigated for the Suc-Ala-Ala-Pro-Phe-p-NA hydrolyzing activity by the enzyme activity measuring method described in the above (2) used for the enzyme preparations NP-1 and PT-1, and the activity to degrade the substrate was recognized.

In addition, the activity of the enzyme preparations obtained in the present invention on gelatin can be detected by confirming the degradation of gelatin by an enzyme on the SDS-polyacrylamide gel. That is, the enzyme preparation to be detected for the enzyme activity was appropriately diluted, and to 10 µl of the sample solution was added 2.5 µl of a sample buffer (50 mM Tris-HCl, pH 7.5, 5% SDS, 5% 2-mercaptoethanol, 0.005% Bromophenol Blue, 50% glycerol), followed by treatment at 100 °C for 5 minutes and electrophoresis using 0.1% SDS-10% polyacrylamide gel containing 0.05% gelatin. After the completion of the run, the gel was soaked in a 50 mM potassium phosphate buffer (pH 7.0), and incubated at 95 °C for 3 hours to carry out the enzyme reaction. Then the gel was stained in a solution of Coomassie Brilliant Blue R-250, 25% ethanol and 10% acetic acid for 30 minutes, and transferred into 7% acetic acid to remove the excess dye over 3 to 15 hours. The presence of the protease activity was detected by the fact that gelatin was hydrolyzed by a protease into peptides which diffused out of the gel and, consequently, the relevant portion of the gel was not stained with Coomassie Brilliant Blue. The enzyme preparations TC-3, NP-1, PT-1, NAPS-1 and NAPS-1S obtained by the present invention had the gelatin hydrolyzing activity at 95 °C.

In addition, the enzyme preparations NP-1, NAPS-1 and NAPS-1S derived from the protease PFUS gene are recognized to have the gelatin hydrolyzing activity at almost the same positions on the gel in the above activity measuring method. From this, it is shown that, in these enzyme preparations, the processing from a precursor enzyme into a mature type enzyme occurs in a similar manner.

Further, the hydrolyzing activity on casein can be detected according to the same method as that used for detecting the activity on gelatin except that 0.1% SDS-10% polyacrylamide gel containing 0.05% casein is used. The enzyme preparations obtained by the present invention, including NAPS-1 and NAPS-1S, had the casein hydrolyzing activity.

Alternatively, the casein hydrolyzing activity of the enzyme preparations TC-3, NP-1, NAPS-1 and NAPS-1S obtained by the present invention can be measured by the following method. 100 µl of an appropriately diluted enzyme preparation was added to 100 µl of a 0.1 M potassium phosphate buffer (pH 7.0) containing 0.2% casein, incubated at 95 °C for 1 hour, and the reaction was stopped by the addition of 100 µl of 15% trichloroacetic acid. An amount of an acid-soluble polypeptide contained in the supernatant obtained by centrifugation of this reaction mixture was determined from the absorbance at 280 nm and compared with that obtained when incubating without the addition of the enzyme preparation, to investigate the enzyme activity. The enzyme preparations TC-3, NP-1, NAPS-1 and NAPS-1S obtained by the present invention had the casein hydrolyzing activity at 95 °C.

(4) Optimum temperature

The optimum temperature of the enzyme preparation TC-3 obtained by the present invention was investigated using the enzyme activity measuring method shown in the above (2) except for varying the temperature. As shown in Fig. 18, the enzyme preparation TC-3 showed the activity at a temperature of 37 to 95 °C and the optimum temperature thereof was 70 to 80 °C. That is, Fig. 18 is a figure showing the relationship between the activity of the enzyme preparation TC-3 obtained in the present invention and a temperature, and the ordinate shows the relative activity to the maximum activity (%) and the abscissa shows a temperature.
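
The ordinate used in these figures, the relative activity to the maximum activity (%), is obtained by dividing each measured activity by the largest value in the series. A minimal sketch of that normalization is given below; the temperatures and activity readings are invented for illustration and are not data from the figures.

```python
def relative_activity(activities: list[float]) -> list[float]:
    """Express each activity as a percentage of the maximum in the series."""
    peak = max(activities)
    if peak <= 0:
        raise ValueError("at least one activity must be positive")
    return [100.0 * value / peak for value in activities]


# Hypothetical readings at four temperatures (arbitrary units)
temperatures_c = [40, 60, 80, 100]
readings = [10.0, 40.0, 50.0, 25.0]

percent = relative_activity(readings)
print(percent)  # [20.0, 80.0, 100.0, 50.0]

# Temperature at which the highest activity was measured
print(temperatures_c[percent.index(100.0)])  # 80
```

The same normalization applies when pH rather than temperature is varied.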

In addition, the optimum temperature of the enzyme preparation NAPS-1 obtained in the present invention was investigated by using the enzyme activity measuring method shown in the above (2) except for varying the temperature. As shown in Fig. 19, the enzyme preparation NAPS-1 had the activity at a temperature between 40 and 110 °C under the measuring conditions of pH 7.0, the optimum temperature being 80 to 95 °C. That is, Fig. 19 is a figure showing the relationship between the activity of the enzyme preparation NAPS-1 obtained in the present invention and a temperature, and the ordinate shows the relative activity to the maximum activity (%) and the abscissa shows a temperature.

(5) Optimum pH

The optimum pH of the enzyme preparation TC-3 obtained by the present invention was investigated by the enzyme activity measuring method shown in the above (2). That is, the Suc-Leu-Leu-Val-Tyr-MCA solutions were prepared using buffers having various pHs, and the enzyme activities obtained by using these solutions were compared. As a buffer, a sodium acetate buffer was used at pH 3 to 6, a sodium phosphate buffer at pH 6 to 8, a sodium borate buffer at pH 8 to 9, and a sodium phosphate-sodium hydroxide buffer at pH 10 to 11. As shown in Fig. 20, the enzyme preparation TC-3 shows the activity at pH 5.5 to 9, and the optimum pH was pH 7 to 8. That is, Fig. 20 is a figure showing the relationship between the activity of the enzyme preparation TC-3 obtained in the present invention and pH, and the ordinate shows the relative activity (%) and the abscissa shows pH.

In addition, the optimum pH of the enzyme preparation NP-1 obtained in the present invention was investigated by
the method shown in the above (2). That is, the Suc-Ala-Ala-Pro-Phe-pNA solutions were
prepared by using the buffers having various pHs, and the enzyme activities obtained by using these solutions were
compared. As a buffer, a sodium acetate buffer was used at pH 4 to 6, a potassium phosphate buffer at pH 6 to 8, a sodium borate
buffer at pH 9 to 10, and a sodium phosphate-sodium hydroxide buffer at pH 10.5. As shown in Fig. 21, the enzyme
preparation NP-1 shows the activity at pH 5 to 10, and the optimum pH was pH 5.5 to 8. That is, Fig. 21 is a figure
showing the relationship between the activity of the enzyme preparation NP-1 obtained in the present invention and pH, and
the ordinate shows the relative activity (%) and the abscissa shows pH.

Further, the optimum pH of the enzyme preparation NAPS-1 obtained in the present invention was investigated by 
the enzyme activity measuring method shown in the above (2). That is, the Suc-Ala-Ala-Pro-Phe-pNA solutions were 
prepared by using the buffers having various pHs, and the enzyme activities obtained by using these solutions were
compared. As a buffer, a sodium acetate buffer was used at pH 4 to 6, a potassium phosphate buffer at pH 6 to 8, and a sodium borate
buffer at pH 8.5 to 10. As shown in Fig. 22, the enzyme preparation NAPS-1 shows the activity at pH 5 to 10, and the
optimum pH was pH 6 to 8. That is, Fig. 22 is a figure showing the relationship between the activity of the enzyme prep- 
aration NAPS-1 obtained in the present invention and pH, and the ordinate shows the relative activity (%) and the 
abscissa shows pH. 
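The buffer assignment described for the NAPS-1 measurements can be written down as a small lookup. The sketch below only encodes the three ranges named in the text; the function name and the list layout are illustrative choices, not part of the specification.

```python
# Buffer systems and pH ranges as stated for the NAPS-1 optimum-pH measurement.
NAPS1_BUFFERS = [
    ("sodium acetate", 4.0, 6.0),
    ("potassium phosphate", 6.0, 8.0),
    ("sodium borate", 8.5, 10.0),
]


def buffers_for_ph(ph, table=NAPS1_BUFFERS):
    """Return the names of every listed buffer whose stated range covers `ph`."""
    return [name for name, low, high in table if ph >= low and high >= ph]


overlap = buffers_for_ph(6.0)   # both acetate and phosphate cover pH 6
gap = buffers_for_ph(8.2)       # no listed buffer covers pH 8.2
alkaline = buffers_for_ph(9.0)
```

Points where two ranges meet (pH 6) return both buffers, which is where the two series can be compared against each other.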


(6) Thermostability 

The thermostability of the enzyme preparation TC-3 obtained by the present invention was investigated. That is, the
enzyme preparation was incubated at 80 °C in 20 mM Tris-HCl, pH 7.5 for various periods of time, an appropriate
amount thereof was taken to measure the enzyme activity by the method shown in the above (2), and the activity was
compared with that when not heat-treated. As shown in Fig. 23, the enzyme preparation TC-3 obtained by the present
invention had not less than 90% of the activity even after the heat-treatment for 3 hours and, thus, was stable to the
above heat-treatment. That is, Fig. 23 is a figure showing the thermostability of the enzyme preparation TC-3 obtained
in the present invention, and the ordinate shows the residual activity (%) after the heat-treatment and the abscissa
shows time.

In addition, the thermostability of the enzyme preparation NP-1 obtained in the present invention was investigated.
That is, the enzyme preparation was incubated at 95 °C in 20 mM Tris-HCl, pH 7.5 for various periods of time, an
appropriate aliquot was taken to determine the enzyme activity by the method shown in the above (2), and the enzyme activity
was compared with that when not heat-treated. As shown in Fig. 24, the enzyme preparation NP-1 obtained in the
present invention showed remarkably increased enzyme activity when incubated at 95 °C. This is considered
to be because a protease produced as a precursor undergoes self-catalytic activation during the heat-treatment.
In addition, no decrease in the activity was recognized in the heat-treatment for up to 3 hours. That is, Fig. 24 is
a figure showing the thermostability of the enzyme preparation NP-1 obtained in the present invention, and the ordinate 
shows the residual activity (%) after the heat-treatment and the abscissa shows the time. 

In addition, the above enzyme preparation NP-1 activated by the heat-treatment was investigated for the
thermostability. That is, the enzyme preparation NP-1 was activated by the heat-treatment at 95 °C for 30 minutes,
incubated at 95 °C for various periods of time, and the activity was determined as described above to compare with that
when not heat-treated. At the same time, buffers having the various pHs (sodium acetate buffer at pH 5, potassium
phosphate buffer at pH 7, sodium borate buffer at pH 9, sodium phosphate-sodium hydroxide buffer at pH 11, 20 mM in
every case) were used. As shown in Fig. 25, when the activated enzyme preparation NP-1 obtained in the present
invention was treated in a buffer at pH 9, it had not less than 90% of the activity after the heat-treatment for 8 hours and
approximately 50% of the activity even after the heat-treatment for 24 hours and, thus, was very stable to the above
heat-treatment. That is, Fig. 25 is a figure showing the thermostability of the enzyme preparation NP-1 obtained in the
present invention, and the ordinate shows the residual activity (%) after the heat-treatment and the abscissa shows the
time.
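As a rough aid for reading the residual-activity figures quoted above, the following sketch shows what a single-exponential (first-order) inactivation model would predict. The model itself is an assumption made purely for illustration; the specification does not state any inactivation kinetics, and the function names are hypothetical.

```python
import math


def rate_constant_from_residual(residual_fraction, hours):
    """First-order rate constant (per hour) implied by one residual-activity reading."""
    return -math.log(residual_fraction) / hours


def predicted_residual(rate_constant, hours):
    """Residual activity fraction after `hours` under first-order decay."""
    return math.exp(-rate_constant * hours)


# Reading taken from the text: approximately 50% activity after 24 hours at pH 9.
k = rate_constant_from_residual(0.5, 24)
eight_hour = predicted_residual(k, 8)
```

Under this assumed model the 24-hour reading alone would imply somewhat under 80% residual activity at 8 hours, which is lower than the reported value of not less than 90%; the comparison merely suggests that the loss of activity need not follow simple first-order kinetics.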

In addition, the enzyme preparation NAPS-1 obtained by the present invention was investigated for the
thermostability. That is, the temperature of the enzyme preparation was maintained at 95 °C in 20 mM Tris-HCl, pH 7.5 for various
periods of time, and an appropriate aliquot was taken to determine the enzyme activity by the method shown in the above
(2) to compare with that when not heat-treated. As shown in Fig. 26, the enzyme preparation NAPS-1 obtained by the
present invention had not less than 80% of the activity even after the heat-treatment at 95 °C for 3 hours and, thus,
was stable against the above heat-treatment. That is, Fig. 26 is a figure showing the thermostability of the enzyme
preparation NAPS-1 obtained in the present invention, and the ordinate shows the residual activity (%) after the
heat-treatment and the abscissa shows the time.

(7) pH stability

The pH stability of the enzyme preparation NP-1 obtained by the present invention was investigated according to
the following procedures. Each 50 µl of 20 mM buffers at various pHs, which contain the enzyme preparation NP-1
activated by the heat-treatment at 95 °C for 30 minutes, was treated at 95 °C for 60 minutes, and an appropriate aliquot
was taken to determine the enzyme activity by the method shown in the above (2) to compare with that when not
treated. As a buffer, a sodium acetate buffer was used at pH 4 to 6, a potassium phosphate buffer at pH 6 to 8, a sodium
borate buffer at pH 9 to 10, and a sodium phosphate-sodium hydroxide buffer at pH 11. As shown in Fig. 27, the enzyme
preparation NP-1 obtained by the present invention retained not less than 95% of the activity even after the treatment
at 95 °C for 60 minutes at pH between 5 and 11. That is, Fig. 27 is a figure showing the pH stability of the enzyme
preparation NP-1 obtained by the present invention, and the ordinate shows the residual activity (%) and the abscissa shows pH.



(8) Stability to detergent

[...] SDS as detergent. The enzyme preparation NP-1 [...] SDS to the final concentration [...] µl of a solution
containing only the enzyme preparation and a solution [...] an appropriate amount thereof was taken to determine the
enzyme activity [...] that when not treated. As shown in Fig. 28, the activated enzyme preparation NP-1 obtained by the
present invention had not less than 80% of the activity [...] stability to SDS of the enzyme preparation NP-1 [...]

[...] investigated using SDS as detergent. [...] solution further containing SDS to the final concentration of 0.1% [...]
1% was prepared. These solutions were incubated at 95 °C for various periods of time, an appropriate aliquot was taken
[...] activity by the method described in the above (2) to compare with that when [...] preparation NAPS-1 obtained by
the present invention [...] 95 °C for 3 hours [...] stability to SDS of the activated enzyme [...] the ordinate shows the
residual activity (%) and the abscissa shows the time.

[...] the enzyme preparation NAPS-1 has remarkably decreased [...] the above results are [...]

(9) Stability to organic solvent

[...] determine the [...] 30, the enzyme preparation NAPS-1 obtained by the present invention had the activity of not
less than 80% of that [...] presence of 50% acetonitrile. That is, Fig. 30 is [...]

(10) Stability to denaturing agent

The stability to various denaturing [...] was investigated using urea and [...] containing urea to the final
concentration of 3.2 M or 6.4 M [...] concentration of 1 M, 3.2 M or 6.4 M was prepared. These solutions [...] an
appropriate aliquot was taken to determine the activity [...] not treated. As shown in Fig. 31, the enzyme preparation
NAPS-1 [...] 1 hour in the presence [...]

(11) Effects of various reagents

The effects of various reagents on the enzyme preparations [...] and [...] were investigated. That is, the above [...]
enzyme activity [...] shown in Table 1.



Table 1

Reagent             TCES      NAPS-1
Control             100%      100%
EDTA                103.5%    36.1%
PMSF                8.1%      0.1%
Antipain            19.0%     81.9%
Chymostatin         0%        6.6%
Leupeptin           104.5%    89.3%
Pepstatin           105.2%    100.7%
N-ethylmaleimide    82.6%     102.6%

As shown in Table 1, when treated with PMSF (phenylmethanesulfonyl fluoride) and chymostatin, both enzyme
preparations had the remarkably decreased activity. In addition, when treated with antipain, the decrease in the activity
was observed in TCES, and when treated with EDTA, in NAPS-1, respectively. In a case of other reagents, the large
decrease was not observed in the activity. 
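The reading of Table 1 given above can be reproduced mechanically. In the sketch below the residual activities are copied from Table 1, while the 50% cut-off used to call a reagent inhibitory is an assumption introduced only for this illustration.

```python
# Residual activity (%) from Table 1: reagent -> (TCES, NAPS-1).
TABLE_1 = {
    "EDTA": (103.5, 36.1),
    "PMSF": (8.1, 0.1),
    "Antipain": (19.0, 81.9),
    "Chymostatin": (0.0, 6.6),
    "Leupeptin": (104.5, 89.3),
    "Pepstatin": (105.2, 100.7),
    "N-ethylmaleimide": (82.6, 102.6),
}
ENZYMES = ("TCES", "NAPS-1")


def inhibitors(enzyme, threshold=50.0):
    """List reagents leaving less than `threshold` % activity (assumed cut-off)."""
    column = ENZYMES.index(enzyme)
    return sorted(r for r, values in TABLE_1.items() if threshold > values[column])


tces_hits = inhibitors("TCES")
naps1_hits = inhibitors("NAPS-1")
shared = sorted(set(tces_hits).intersection(naps1_hits))
```

With this cut-off, PMSF and chymostatin are flagged for both preparations, antipain only for TCES and EDTA only for NAPS-1, matching the description.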

(12) Molecular weight 

A molecular weight of the enzyme preparation NAPS-1 obtained by the present invention was determined by
SDS-PAGE using 0.1% SDS-10% polyacrylamide gel. The enzyme preparation NAPS-1 showed a molecular weight of about
45 kDa on SDS-PAGE. On the other hand, the enzyme preparation NAPS-1S showed the same molecular weight as
that of the enzyme preparation NAPS-1. 

(13) N-terminal amino acid sequence 

The N-terminal amino acid sequence of a mature enzyme, the protease PFUS, was determined using the enzyme 
preparation NAPS-1 obtained by the present invention. The enzyme preparation NAPS-1 electrophoresed on 0.1% 
SDS-10% polyacrylamide gel was transferred onto the PVDF membrane, and the N-terminal amino acid sequence of
the enzyme on the membrane was determined by the automated Edman degradation using a protein sequencer. The 
N-terminal amino acid sequence of the mature type protease PFUS thus determined is shown in SEQ ID No. 42 of the 
Sequence Listing. The sequence coincided with the sequence of amino acids 133 to 144 in the amino acid sequence 
of the protease PFUS represented by SEQ ID No. 35 of the Sequence Listing, and it was shown that the mature
protease PFUS is an enzyme consisting of the polypeptide from this part onward. The amino acid sequence of the
mature protease PFUS thus revealed is represented by SEQ ID No. 3 of the Sequence Listing. In addition, as described 
above, there is no influence on the enzyme activity of the protease PFUS independently of whether 428th amino acid 
(corresponding to 560th amino acid in the amino acid sequence represented by SEQ ID No. 35 of the Sequence Listing) 
is glycine or valine. Further, within the nucleotide sequence of the protease PFUS gene represented by SEQ ID No. 34
of the Sequence Listing, that of a region encoding the mature type enzyme is shown in SEQ ID No. 4. The 1283rd base
in the sequence may be guanine or thymine.
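The numbers quoted above are mutually consistent, which a few lines of arithmetic can confirm. The sketch assumes that SEQ ID No. 4 starts with the first base of the first codon of the mature enzyme; the function names are illustrative only.

```python
# Standard genetic code: glycine is GGN, valine is GTN.
GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}
VALINE_CODONS = {"GTT", "GTC", "GTA", "GTG"}

# The mature enzyme begins at residue 133 of the precursor (SEQ ID No. 35).
MATURE_START_IN_PRECURSOR = 133


def locate_base(base_number):
    """Map a 1-based base number to (1-based codon number, 1-based position in codon)."""
    codon_index, offset = divmod(base_number - 1, 3)
    return codon_index + 1, offset + 1


def precursor_position(mature_residue):
    """Convert mature-enzyme residue numbering to precursor numbering."""
    return mature_residue + MATURE_START_IN_PRECURSOR - 1


codon_number, position_in_codon = locate_base(1283)
precursor_residue = precursor_position(codon_number)
```

Base 1283 falls on the second position of codon 428, and glycine and valine codons differ exactly there (G versus T), so a guanine or thymine at that base corresponds to glycine or valine at residue 428 (residue 560 of the precursor).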

In a case of in vitro gene amplification by PCR, the misincorporation of a nucleotide may occur during the elongation
reaction, leading to the nucleotide substitution in the sequence of the resulting DNA. This frequency largely
depends upon the kind of the enzyme used for PCR, the composition of the reaction mixture, the reaction conditions, 
the nucleotide sequence of a DNA to be amplified and the like. However, when a certain region in a gene is simply 
amplified in the usual manner, the frequency is at most around one nucleotide per 400 nucleotides. In the present
invention, PCR was used for isolation of a gene of the protease TCES or the protease PFUS or construction of the 
expression plasmid therefor. The number of nucleotide substitutions in the nucleotide sequence of the resulting gene 
is, if any, a few nucleotides. Taking into consideration the fact that a nucleotide substitution in a gene does not
necessarily lead to an amino acid substitution in the expressed protein due to degeneracy of translation codons, the
number of the possible amino acid substitutions can be evaluated to be at most 2 to 3 in the whole residues. It cannot
be denied that the nucleotide sequence of a gene of the protease TCES and the protease PFUS and the amino acid
sequence of the proteases disclosed herein are different from natural ones. However, the object of the present invention
is to disclose a hyperthermostable protease having the high activity at high temperature and a gene encoding the same
and, therefore, the protease and the gene are not limited to the same enzyme and the same gene encoding the same 
as the natural ones. And it is clear to those skilled in the art that even a gene having the possible nucleotide substitution 
can hybridize to a natural gene under the stringent conditions. 
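The estimate of possible substitutions discussed above can be illustrated with a short calculation. The sketch takes the quoted frequency of about one misincorporation per 400 nucleotides, and weights it by the share of single-base changes in the standard genetic code that alter the encoded amino acid. Uniform codon usage, equal probability of every base change, and the 1200-nucleotide example length are all assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_single_substitutions():
    """Count every single-base change of every sense codon by its effect."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


def expected_substitutions(length_nt, nt_per_error=400):
    """Expected nucleotide substitutions for an amplified region of `length_nt`."""
    return length_nt / nt_per_error


def expected_amino_acid_changes(length_nt, nt_per_error=400):
    """Expected missense changes, assuming uniform codon usage and base changes."""
    counts = classify_single_substitutions()
    missense_share = counts["missense"] / sum(counts.values())
    return expected_substitutions(length_nt, nt_per_error) * missense_share


counts = classify_single_substitutions()
example = expected_amino_acid_changes(1200)  # hypothetical 1200-nt coding region
```

For a hypothetical coding region of this size the result stays within the "2 to 3" range mentioned in the text, because roughly a quarter of single-base changes are silent.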

Further, in the specification, a method for obtaining a gene of interest is clearly disclosed such that (1) the library 
for expression cloning is made from a chromosomal DNA of the hyperthermophiles and the expression of the protease 
activity is screened, (2) a gene possibly expressing the hyperthermostable protease is isolated by hybridization or PCR 
based on the homology of amino acid sequences, and the enzyme action of expression products of these genes, that 
is, the hyperthermostable protease activity is confirmed using an appropriate microorganism. Therefore, it can be easily 
determined by using the above method whether the gene sequence with the mutation introduced encodes a hyperther- 
mostable protease, after a variety of mutations are introduced into the hyperthermostable protease gene of the present 
invention using the known mutation introducing method. The kind of the mutation to be introduced is not limited to spec- 
ified ones as long as the gene sequence obtained as a result of the mutation introduction expresses substantially the 
same protease activity as that of the hyperthermostable protease of the present invention. However, in order that the 
expressed protein retains the protease activity, the mutation is desirably introduced into a region other than four regions 
which are conserved in common in the serine proteases. 

A mutation can be randomly introduced into any region of a gene encoding the hyperthermostable protease (ran- 
dom mutagenesis), or alternatively, a desired mutation can be introduced into a specified pre-determined position (site- 
directed mutagenesis). As a method for randomly introducing a mutation, for example, there is a method for chemically 
treating a DNA. In this case, a plasmid is prepared such that a region into which a mutation is sought to be introduced 
is partially single-stranded, and sodium bisulfite is allowed to act on this partially single-stranded region to convert the base
cytosine into uracil, thus introducing a transition mutation from C:G to T:A. In addition, a method for producing a base
substitution during a process where a single-stranded part is repaired to double-strand in the presence of [α-S] dNTP
is also known. The details of these methods are described in Proc. Natl. Acad. Sci. USA, volume 79, page 1408-1412 
(1982), and Gene, volume 64, page 313-319 (1988). 
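The net effect of the bisulfite treatment on the exposed strand can be sketched as a string operation. The sketch assumes complete conversion of every cytosine in the single-stranded window, whereas a real treatment converts only a fraction of them, and it writes the resulting uracil directly as T, as it would be read after replication. The function name and the toy sequence are hypothetical.

```python
def bisulfite_convert(sequence, start, end):
    """Return `sequence` with every C in the single-stranded window [start, end) read as T.

    Positions are 0-based; bases outside the window are double-stranded and left unchanged.
    """
    window = sequence[start:end].replace("C", "T")
    return sequence[:start] + window + sequence[end:]


# Toy example: only positions 1 to 4 are single-stranded.
mutated = bisulfite_convert("ACGTCC", 1, 5)
```

The final C of the toy sequence lies outside the window and is therefore untouched, mirroring how the mutation is confined to the single-stranded region.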

Random mutation can also be introduced by conducting PCR under the conditions where fidelity of the nucleotide 
incorporation becomes lower. In particular, the addition of manganese to the reaction system is effective and the details 
of this method are described in Anal. Biochem., volume 224, page 347-355 (1995). As a method for introducing a site- 
directed mutation, for example, there is a method using a system where a gene of interest is made single-stranded, a 
primer designed depending upon a mutation sought to be introduced in this single-stranded part is synthesized, and the 
primer is annealed to the part, which is introduced into in vivo system where only the strand with a mutation introduced
is selectively replicated. The details of this method are described in Methods in Enzymology, volume 154, page 367 
(1987). For example, a mutation introducing kit, Mutant K manufactured by Takara Shuzo Co., Ltd. can be used.
Site-directed mutagenesis can also be conducted by PCR, and the details of the method are described in PCR Technology,
page 61-70 (1989), edited by Erlich and published by Takara Shuzo Co., Ltd. Alternatively, for example, LA-PCR in vitro
mutagenesis kit manufactured by Takara Shuzo Co., Ltd. can be used. By using the above method, a mutation of sub- 
stitution, deletion and insertion can be introduced. 
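The primer used for a site-directed substitution carries the desired base flanked by sequence that matches the single-stranded target. A minimal sketch of that design step follows; the flank length of ten bases, the function name and the toy template are assumptions, and the primer is written in the same sense as the template string for simplicity rather than as its complement.

```python
def mutagenic_primer(template, position, new_base, flank=10):
    """Build a primer sequence carrying `new_base` at 0-based `position`.

    The primer spans up to `flank` bases on each side of the mutated position.
    """
    if template[position] == new_base:
        raise ValueError("new base is identical to the template base")
    start = max(0, position - flank)
    end = min(len(template), position + flank + 1)
    return template[start:position] + new_base + template[position + 1:end]


# Toy template; position 8 (a G) is changed to T with three flanking bases per side.
primer = mutagenic_primer("AAAACCCCGGGGTTTT", 8, "T", flank=3)
```

In practice flank length is chosen so that the primer still anneals stably despite the mismatch.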

Thus, an enzyme having the similar thermostability and optimum temperature to those of the hyperthermostable 
protease of the present invention but having a little different, for example, optimum pH can be produced in a host by 
introducing a mutation using as a base the hyperthermostable protease gene of the present invention. In this case, the 
base nucleotide sequence of the hyperthermostable protease gene is not necessarily limited to the sequence derived 
from one hyperthermostable protease. 

A hybrid gene can be made by recombinating two or more hyperthermostable protease genes having a sequence 
homologous to each other, such as those disclosed by the present invention, by exchanging the homologous sequence, 
and the hybrid enzyme encoded by the gene can be produced in a host. Also in a case of a hybrid gene, whether it is 
a hyperthermostable protease gene can be determined by testing for the enzyme action of the gene product, that is, the 
protease activity. For example, by using the above plasmid pSPT1, a hybrid protease of which N-terminal part is derived
from the protease PFUS and of which C-terminal part is derived from the protease TCES can be produced in Bacillus
subtilis, and this hybrid protease has the protease activity at 95 °C.
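The idea of exchanging at a homologous sequence can be sketched with plain strings: the upstream part is taken from one gene, and everything from the shared stretch onward from the other. The sequences below are invented toy strings, not the PFUS or TCES genes, and the function name is hypothetical.

```python
def make_hybrid(n_donor, c_donor, junction):
    """Join the part of `n_donor` before `junction` to `c_donor` from `junction` onward.

    `junction` is a sequence shared by both genes (the homologous region).
    """
    n_index = n_donor.find(junction)
    c_index = c_donor.find(junction)
    if n_index == -1 or c_index == -1:
        raise ValueError("junction sequence must occur in both genes")
    return n_donor[:n_index] + c_donor[c_index:]


# Toy genes sharing the stretch GGATCC.
hybrid = make_hybrid("AAAGGATCCTTT", "CCCGGATCCGGG", "GGATCC")
```

Swapping the two arguments gives the reciprocal hybrid, i.e. the opposite assignment of N-terminal and C-terminal parts.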

The hybrid enzyme is expected to have the properties of two or more base enzymes at the same time. For example, 
when the protease TCES and the protease PFUS disclosed herein are compared, the protease TCES is superior in 
respect of the extracellular secretion efficiency and the protease PFUS is superior in respect of the thermostability. 
Since a signal sequence located at the N-terminal of a protein has a great influence on extracellular secretion
efficiency, if an expression plasmid is constructed so that a protein having, in contrast with pSPT1, an N-terminal part
derived from the protease TCES and a C-terminal part derived from the protease PFUS is produced, a hyperthermostable
protease having thermostability equal to that of the protease PFUS can be secreted at a secretion efficiency
equal to that of the protease TCES. In addition, since a signal sequence is cut from an enzyme when the enzyme is
extracellularly secreted, it has little influence on the nature of the enzyme itself. Therefore, when a hyperthermostable
protease is produced using a mesophile, its signal sequence does not necessarily need to be derived from
hyperthermophiles, and a signal sequence derived from a mesophile has no problem as long as a protein of interest is
extracellularly secreted at a higher efficiency.

In particular, when a signal sequence of a secretory protein which is highly expressed in a host to be used is 
employed, a higher secretion is expected. 

Upon construction of the above hybrid gene, recombination does not necessarily need to be conducted
site-directedly. Alternatively, a hybrid gene can be made, for example, by mixing two or more DNAs of a hyperthermostable
protease gene, which are raw materials for construction of the hybrid gene, fragmenting these with a DNA degrading 
enzyme and reconstituting these fragments using a DNA polymerase. The details of this method are described in Proc. 
Natl. Acad. Sci. USA, volume 91, page 10747-10751 (1994). Also in this case, a sequence of a gene encoding a
hyperthermostable protease can be isolated and identified from the resulting hybrid genes by examining the hyperthermostable
protease activity of expressed proteins as described above. In addition, it is expected that sequences encoding four 
regions common to the serine proteases are conserved in the sequences of the genes thus obtained. 

Therefore, it is clear to those skilled in the art that the resulting hybrid gene can hybridize to a DNA selected from
the oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R having the nucleotide sequences represented by SEQ ID
Nos. 9, 10, 11 and 12 of the Sequence Listing under appropriate hybridization conditions. In addition, it is also clear
that a novel hyperthermostable protease gene obtained by the above mutation introduction can hybridize to a gene
having a DNA sequence selected from nucleotide sequences represented by SEQ ID Nos. 9, 10, 11 and 12 of the
Sequence Listing, for example, the protease PFUL gene, under appropriate hybridization conditions.

In the specification, the description has focused on obtaining a hyperthermostable protease gene. However, a gene
encoding a novel protease having both high thermostability and other properties can be made by constructing a hybrid gene
of the hyperthermostable protease gene of the present invention and a protease gene having a sequence homology 
with the hyperthermostable protease gene of the present invention but having no thermostability, for example, by con- 
structing a hybrid gene with a gene of subtilisin to improve the thermostability of subtilisin, to obtain a gene encoding a 
protease having the properties originally retained by subtilisin and the higher thermostability. 

In the present invention, we used Escherichia coli and Bacillus subtilis as a host into which a gene is introduced in
order to detect the protease activity retained by a protein encoded by a gene and produce an enzyme preparation.
However, hosts into which a gene is introduced are not limited to specified ones. Any hosts can be used as long as a
transforming method is established for the hosts, such as Bacillus brevis, Lactobacillus, yeast, mold fungi, animal cells, plant
cells, insect cells and the like. Upon this, it is important that a polypeptide is folded such that an expressed protein
becomes an active form and this does not result in the harmful or lethal effect. Among hosts listed above, Bacillus
brevis, Lactobacillus and mold fungi, which are known to secrete their products into a medium, can be used as a host for
mass production of a protease of interest on an industrial scale, in addition to Bacillus subtilis.

Examples 

The following Examples further describe the present invention in detail but do not limit the scope thereof.
Example 1 

(1) Preparation of oligonucleotide for detection of hyperthermostable protease gene 

By comparing the amino acid sequence of the protease PFUL represented by SEQ ID No. 8 of the Sequence Listing
with those of alkaline serine proteases derived from known bacteria, homologous amino acid sequences
common to them proved to exist. Among them, three regions were selected and the oligonucleotides were designed, which
were used as primers for PCR to detect hyperthermostable protease genes.

Figs. 2, 3 and 4 show the relationship among the amino acid sequences corresponding to the above three regions 
of the protease PFUL, the nucleotide sequences of the protease PFUL gene encoding the regions, and the nucleotide 
sequences of the oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R synthesized based thereon. SEQ ID Nos. 
9, 10, 11 and 12 show the nucleotide sequences of the oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R, 
respectively. 

(2) Preparation of chromosomal DNA of Thermococcus celer 

10 ml of a culture of Thermococcus celer DSM2476 obtained from Deutsche Sammlung von Mikroorganismen und
Zellkulturen GmbH was centrifuged to collect the cells, which were suspended in 100 µl of 50 mM Tris-HCl, pH 8.0
containing 25% sucrose. To this suspension were added 20 µl of 0.5 M EDTA and 10 µl of 10 mg/ml lysozyme, and the mixture was
incubated at 20 °C for 1 hour, 800 µl of a SET solution (150 mM NaCl, 1 mM EDTA, 20 mM Tris-HCl, pH 8.0), 50 µl of 10% [...]

(3) Detection of hyperthermostable protease gene by PCR

A PCR reaction mixture [...] oligonucleotides PRO-1F and [...] consisting of 94 °C for 1 minute - 55 °C for 1 minute -
72 °C for 1 minute [...]. An aliquot of these reaction mixtures was subjected to agarose gel electrophoresis, [...]
amplification [...] oligonucleotides PRO-1F and PRO-2R, and one DNA fragment in the case of the [...] PRO-2F and PRO-4R
were observed. These amplified [...] using a DNA blunting kit (manufactured by Takara Shuzo Co., Ltd.) [...] kinase
(manufactured by Takara Shuzo Co., Ltd.) [...] (manufactured by Takara Shuzo Co., Ltd.), the resulting fragments were
dephospho[...] followed by sequencing of the inserted fragment [...].

Of these plasmids, the amino acid sequence deduced from the nucleotide sequence of the plasmid p1F-2R(2)
containing an about 150 bp DNA fragment amplified using [...] deduced from the nucleotide sequence of the plasmid
p2F-4R containing an about 550 bp [...] using oligonucleotides PRO-2F and PRO-4R contained sequences [...] sequences of
the protease PFUL, subtilisin and [...]. SEQ ID No. 13 of the Sequence Listing shows the nucleotide sequence of the
inserted DNA fragment in the plasmid p1F-2R(2) and the amino acid sequence deduced therefrom, and SEQ ID No. 14
of the Sequence Listing shows the nucleotide [...] fragment in the plasmid p2F-4R and the [...] PRO-2F and PRO-4R,
respectively) used as primers for PCR.

Fig. 5 shows a restriction map of the plasmid p2F-4R.

(4) Screening of protease gene derived from Thermococcus celer

The chromosomal DNA of Thermococcus celer [...] (manufactured by Takara Shuzo Co., Ltd.), mixed with the lambda
GEM-11 XhoI Half-Site Arms Vector (manufactured by Promega) to allow ligating [...] by Takara Shuzo Co., Ltd.) in the
presence [...], which was subjected to in vitro packaging [...] the chromosomal DNA fragments of Thermococcus celer. A
part of the library [...] to form the plaques on a [...] membrane (manufactured by [...] 1.5 M NaCl, then [...]

[...] Co., Ltd.), which was subjected to 1% agarose gel electrophoresis to [...]. By using this fragment as a template
and using Random [...] (manufactured by Takara Shuzo Co., Ltd.) and [α-32P]dCTP [...]

The membrane with the DNA [...] 0.5% SDS, 0.1% BSA, 0.1% polyvinylpyrrolidone, 0.1% Ficoll 400 [...] salmon sperm
DNA) at 50 °C for 2 hours, and transferred to the same buffer containing the labeled DNA probe, [...] hybridization at
50 °C for 15 hours. After the completion of hybridization, the membrane was washed with 2 x SSC containing [...]
temperature, then with 1 x SSC [...] to obtain an autoradiogram. About 3,000 phage clones were [...] containing 1%
chloroform.

(5) Detection of phage DNA fragment containing protease gene derived from Thermococcus celer 

Escherichia coli LE392 transduced with the above phage clone was cultured in the N2CMY medium (manufactured
by Bio101) at 37 °C for 15 hours to obtain a culture, from which a supernatant was collected to prepare a phage
DNA using QIAGEN-lambda kit (manufactured by QIAGEN). The resulting phage DNAs were digested with BamHI,
EcoRI, EcoRV, HincII, KpnI, NcoI, PstI, SacI, SalI, SmaI and SphI (all manufactured by Takara Shuzo Co., Ltd.),
respectively, followed by agarose gel electrophoresis. Then, DNAs were transferred from the gel to Hybond-N+
membrane according to the Southern transfer method described in Molecular Cloning; A Laboratory Manual, 2nd edition
(1986), edited by T. Maniatis, et al., published by Cold Spring Harbor Laboratory.

The resulting membrane was treated in a hybridization buffer at 50 °C for 4 hours, and transferred to the same
buffer containing the 32P-labeled DNA probe used in Example 1-(4), followed by hybridization at 50 °C for 18 hours.
After the completion of hybridization, the membrane was washed in 1 x SSC containing 0.5% SDS at 50 °C, then rinsed
with 1 x SSC and air dried. The membrane was exposed to an X-ray film at -80 °C for 6 hours to obtain an autoradiogram.
This autoradiogram indicated that an about 9 kb DNA fragment contained a protease gene in case of the phage DNA
digested with KpnI.

Then, the phage DNA containing the above protease gene was digested with KpnI, and further digested
successively with BamHI, PstI and SphI, followed by 1% agarose gel electrophoresis. According to procedures similar to
those described above, Southern hybridization was conducted and it was indicated that an about 5 kb KpnI-BamHI
fragment contained a protease gene.

(6) Cloning of DNA fragment containing protease gene derived from Thermococcus celer 

The phage DNA containing the above protease gene was digested with KpnI and BamHI, which was subjected to
1% agarose gel electrophoresis to separate and isolate an about 5 kb DNA fragment from the gel. Then, the plasmid
vector pUC119 (manufactured by Takara Shuzo Co., Ltd.) was digested with KpnI and BamHI, which was mixed with
the above about 5 kb DNA fragment to allow ligation, followed by introduction into Escherichia coli JM109. Plasmids
were prepared from the resulting transformant, and the plasmid containing the about 5 kb DNA fragment was selected and
designated the plasmid pTC3.

Fig. 6 shows a restriction map of the plasmid pTC3.

(7) Preparation of plasmid pTCS6 containing protease gene derived from Thermococcus celer 

The above plasmid pTC3 was digested with SacI, which was electrophoresed using 1% agarose gel, and Southern
hybridization was carried out according to the same manner as that described in Example 1-(5) for detecting the phage
DNA fragment containing a protease gene. A signal on the resulting autoradiogram indicated that an about 1.9 kb DNA
fragment obtained by digesting the plasmid pTC3 with SacI contained a hyperthermostable protease gene.

Then, the plasmid pTC3 was digested with SacI, which was subjected to 1% agarose gel electrophoresis to isolate
an about 1.9 kb DNA fragment. Then, the plasmid vector pUC118 (manufactured by Takara Shuzo Co., Ltd.) was
digested with SacI, which was dephosphorylated using alkaline phosphatase and mixed with the about 1.9 kb fragment
to allow ligation, followed by introduction into Escherichia coli JM109. Plasmids were prepared from the resulting
transformant, and the plasmid containing only one molecule of the about 1.9 kb fragment was selected and designated the
plasmid pTCS6.

Fig. 7 shows a restriction map of the plasmid pTCS6. 

(8) Determination of nucleotide sequence of DNA fragment derived from Thermococcus celer contained in plasmid 
pTCS6 

In order to determine the nucleotide sequence of the protease gene derived from Thermococcus celer inserted into
the plasmid pTCS6, deletion mutants in which the DNA fragment inserted into the plasmid had been deleted
to various lengths were prepared using the Kilo Sequence Deletion Kit (manufactured by Takara Shuzo Co., Ltd.). Among
them, several mutants having deletions of suitable length were selected, the nucleotide sequence of each
inserted DNA fragment was determined by the dideoxy method, and the results were combined to determine
the nucleotide sequence of the inserted DNA fragment contained in the plasmid pTCS6. SEQ ID No. 15 of the
Sequence Listing shows the resulting nucleotide sequence.
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sette primer 
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Fig. 8 shows a restriction map of the plasmid pTC4. 
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pTC4 and protease TCES gene 
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and designated the plasmid pBTCS. ... which was blunt-ended and was sub-

Then, the plasmid pBTCS was completely digested with HindIII ... Plasmids were prepared from the

the SacI site of the plasmid pTCS6 and can introduce ... the primer TCE12 and the primer TCE20R.






EP 0 870 833 A1 



coccus celer as a template. A reaction of 25 cycles, each cycle consisting of 94 °C for 30 seconds - 55 °C for 1 minute 
- 72 °C for 1 minute, was carried out to amplify an about 0.9 kb DNA fragment having these two oligonucleotides on 
both ends and containing a part of the protease TCES gene. 
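
As a rough planning aid, the programmed hold time of a cycling program such as the one above can be totalled as follows. This is only a minimal sketch: the function name `program_minutes` is an illustrative assumption, and ramp times between temperatures are ignored, so the result is a lower bound on the real run time.

```python
def program_minutes(cycles, step_seconds):
    """Return the total programmed hold time, in minutes, of a PCR program.

    cycles       -- number of cycles
    step_seconds -- hold time of each step within one cycle, in seconds
    Ramp times between steps are not modelled.
    """
    return cycles * sum(step_seconds) / 60.0


# 25 cycles of 94 °C for 30 s - 55 °C for 1 min - 72 °C for 1 min
tces_fragment = program_minutes(25, [30, 60, 60])
print(tces_fragment)  # 62.5
```

The same helper can be applied to the other cycling programs in these Examples by changing the cycle count and the list of hold times.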

The above about 0.9 kb DNA fragment was digested with EcoRI and ClaI (manufactured by Takara Shuzo Co.,
Ltd.), mixed with the EcoRI-ClaI digested plasmid pBTCSHKB and ligated, followed by introduction into
Escherichia coli JM109. Plasmids were prepared from the resulting transformants, and the plasmid containing only one
molecule of the about 0.9 kb fragment was selected and designated plasmid pBTC6.

(12) Preparation of plasmid pTC12 containing protease TCES gene

The plasmid pBTC6 was digested with BamHI and SphI and subjected to 1% agarose gel electrophoresis
to recover the separated about 3 kb DNA fragment. Then, the plasmid pUC-P43SD, in which the ribosome binding site
sequence derived from the Bacillus subtilis P43 promoter had been introduced between the KpnI site and the BamHI
site of the plasmid vector pUC18 (manufactured by Takara Shuzo Co., Ltd.) (the nucleotide sequences of the synthetic
oligonucleotides BS1 and BS2 used for introduction of the sequence are shown in SEQ ID Nos. 20 and 21 of the
Sequence Listing), was digested with BamHI and SphI, mixed with the previously recovered about 3 kb DNA fragment
and ligated, followed by introduction into Escherichia coli JM109. Plasmids were prepared from the resulting
transformants, and the plasmid containing only one molecule of the above about 3 kb DNA fragment was selected and
designated plasmid pTC12.

(13) Preparation of plasmid pSTC3 containing protease TCES gene for transforming Bacillus subtilis 

The above plasmid pTC12 was digested with KpnI and SphI and subjected to 1% agarose gel electrophoresis
to recover the separated about 3 kb DNA fragment. Then, the plasmid vector pUB18-P43 was digested with SacI,
blunt-ended and allowed to self-ligate to give the plasmid vector pUB18-P43S, from which the SacI site had been
removed. This was digested with KpnI and SphI, mixed with the previously recovered about 3 kb DNA fragment
and ligated, followed by introduction into Bacillus subtilis DB104. Plasmids were prepared from the
resulting kanamycin-resistant transformants, and the plasmid containing only one molecule of the above about 3 kb DNA
fragment was selected and designated plasmid pSTC2.

Then, the plasmid pSTC2 was digested with SacI and subjected to intramolecular ligation, followed by
introduction into Bacillus subtilis DB104. Plasmids were prepared from the resulting kanamycin-resistant
transformants, and the plasmid containing only one SacI site was selected and designated plasmid pSTC3.

Then, Bacillus subtilis DB104 harbouring the plasmid pSTC3 was designated Bacillus subtilis DB104/pSTC3.

Fig. 10 shows a restriction map of the plasmid pSTC3.

Example 2 

(1) Preparation of chromosomal DNA of Pyrococcus furiosus

Pyrococcus furiosus DSM3638 was cultured as follows. A medium having the composition of 1% tryptone, 0.5%
yeast extract, 1% soluble starch, 3.5% Jamarin S·Solid (manufactured by Jamarin Laboratory), 0.5% Jamarin S·Liquid
(manufactured by Jamarin Laboratory), 0.003% MgSO4, 0.001% NaCl, 0.0001% FeSO4·7H2O, 0.0001% CoSO4,
0.0001% CaCl2·7H2O, 0.0001% ZnSO4, 0.1 ppm CuSO4·5H2O, 0.1 ppm H3BO3, 0.1 ppm KAl(SO4)2, 0.1 ppm
Na2MoO4·2H2O and 0.25 ppm NiCl2·H2O was placed in a 2 liter medium bottle and sterilized at 120 °C for 20
minutes. Nitrogen gas was blown into the medium to purge the dissolved oxygen, the above bacterial strain was
inoculated into the medium, and stationary culture was carried out at 95 °C for 16 hours. After the completion of
cultivation, the cells were collected by centrifugation.

Then, the resulting cells were suspended in 4 ml of 50 mM Tris-HCl (pH 8.0) containing 25% sucrose. To this
suspension were added 2 ml of 0.2 M EDTA and 0.8 ml of lysozyme (5 mg/ml), and the mixture was incubated at
20 °C for 1 hour. Then, 24 ml of a SET solution (150 mM NaCl, 1 mM EDTA, 20 mM Tris-HCl, pH 8.0), 4 ml of
5% SDS and 400 µl of proteinase K (10 mg/ml) were added thereto, and the mixture was incubated at 37 °C for
another 1 hour. The reaction was stopped by extraction with phenol-chloroform, followed by ethanol precipitation
to obtain about 3.2 mg of the chromosomal DNA.
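
The working concentrations in the two incubation steps follow from the volumes given above by the dilution rule C1·V1 = C2·V2. The following is a minimal sketch of that arithmetic; the helper name `final_conc` is an illustrative assumption, volumes are assumed to be additive, and the EDTA contributed by the SET solution is not included.

```python
def final_conc(stock_conc, stock_volume, total_volume):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_volume / total_volume


# Volumes in ml, taken from the lysis procedure above.
cell_suspension = 4.0
edta_0_2_M = 2.0
lysozyme_5_mg_ml = 0.8
set_solution = 24.0
sds_5_percent = 4.0
proteinase_k_10_mg_ml = 0.4

# First incubation (20 °C): suspension + EDTA + lysozyme.
v1 = cell_suspension + edta_0_2_M + lysozyme_5_mg_ml
edta_mM = final_conc(200.0, edta_0_2_M, v1)  # 0.2 M stock = 200 mM
lysozyme_mg_ml = final_conc(5.0, lysozyme_5_mg_ml, v1)

# Second incubation (37 °C): SET solution, SDS and proteinase K added.
v2 = v1 + set_solution + sds_5_percent + proteinase_k_10_mg_ml
sds_percent = final_conc(5.0, sds_5_percent, v2)
proteinase_k_mg_ml = final_conc(10.0, proteinase_k_10_mg_ml, v2)

print(round(v1, 1), round(v2, 1))
print(round(edta_mM, 1), round(lysozyme_mg_ml, 2))
print(round(sds_percent, 2), round(proteinase_k_mg_ml, 3))
```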

(2) Genomic Southern hybridization of Pyrococcus furiosus chromosomal DNA

A chromosomal DNA of Pyrococcus furiosus was digested with SacI, NotI, XbaI, EcoRI and XhoI (all manufactured
by Takara Shuzo Co., Ltd.), respectively. An aliquot of the reaction mixture was further digested with SacI and EcoRI,
which was subjected to 1% agarose gel electrophoresis, followed by Southern hybridization according to the procedures
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described in Example 1-(5). A 32P-labeled DNA, which was prepared using an about 0.3 kb DNA fragment obtained by
digesting the above plasmid p1F-2R(2) with EcoRI and PstI as a template and using BcaBEST DNA Labeling Kit
(manufactured by Takara Shuzo Co., Ltd.) and [α-32P]dCTP, was used as a probe. The membrane was washed in 2 x SSC
containing SDS at a final concentration of 0.5% at room temperature and rinsed with 2 x SSC, and an autoradiogram
was obtained. As a result, a signal was observed on two DNA fragments of about 5.4 kb and about 3.0 kb produced by
digesting Pyrococcus furiosus chromosomal DNA with SacI, indicating that a protease gene was present on
each fragment. When the SacI-digested fragments were further digested with SpeI (manufactured by Takara Shuzo
Co., Ltd.), the signal on the about 5.4 kb fragment did not change, but the signal that had been seen
on the about 3.0 kb fragment was lost and a signal was newly observed on an about 0.6 kb fragment. Since no SpeI
site is present in the protease PFUL gene represented by SEQ ID No. 7 of the Sequence Listing, it was suggested
that the signal on the about 0.6 kb fragment obtained by digestion with SacI and SpeI was derived from a novel
hyperthermostable protease (hereinafter referred to as "protease PFUS"). In addition, regarding the products of digestion
of Pyrococcus furiosus chromosomal DNA with XbaI, a signal was observed on two DNA fragments of about 3.3 kb and
about 9.0 kb. From the restriction map of the protease PFUL gene shown in Fig. 1, it was presumed that the about 3.3 kb
fragment contained the protease PFUL gene and the about 9.0 kb fragment contained the protease PFUS gene. When
the above chromosomal DNA was digested with XbaI and SacI, a signal was observed on an about 2.0 kb fragment
and an about 3.0 kb fragment. From the positions of the SacI and XbaI cleavage sites present in the protease PFUL
gene shown in SEQ ID No. 7 of the Sequence Listing, it was presumed that the protease PFUL gene was present on the
about 2.0 kb SacI-XbaI fragment. On the other hand, it was presumed that the protease PFUS gene was present on the
about 3.0 kb fragment. Combined with the results of digestion with SacI, this showed that no XbaI site is present
on the about 3.0 kb DNA fragment obtained by digestion with SacI alone.
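
The reasoning used above, comparing the hybridizing bands of a single digest with those of a double digest, can be sketched as simple set arithmetic. The function name `compare_digests` is an illustrative assumption, and matching band sizes exactly is a simplification, since sizes read from a gel are only approximate.

```python
def compare_digests(single, double):
    """Compare hybridizing fragment sizes (kb) from a single and a double digest.

    Returns (unchanged, lost, gained) as sorted lists. A fragment that is lost
    in the double digest must contain a site for the second enzyme.
    """
    single, double = set(single), set(double)
    return (sorted(single & double),
            sorted(single - double),
            sorted(double - single))


# Signals reported above: SacI alone, then SacI followed by SpeI.
unchanged, lost, gained = compare_digests([5.4, 3.0], [5.4, 0.6])
print(unchanged, lost, gained)  # [5.4] [3.0] [0.6]
```

Here the 3.0 kb SacI fragment is the one cut by SpeI, which is the observation that distinguished the protease PFUS gene from the protease PFUL gene.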

(3) Cloning of 0.6 kb SpeI-SacI fragment containing protease PFUS gene

A chromosomal DNA of Pyrococcus furiosus was digested with SacI and SpeI and subjected to 1% agarose
gel electrophoresis to recover the DNA fragments corresponding to about 0.6 kb from the gel. Then, the plasmid
pBluescript SK(-) (manufactured by Stratagene) was digested with SacI and SpeI, mixed with the about 0.6 kb
DNA fragments and ligated, followed by introduction into Escherichia coli JM109 to obtain a plasmid library
containing the chromosomal DNA fragments. The transformed Escherichia coli JM109 was seeded on a plate to form
colonies, and the resulting colonies were transferred to a Hybond-N+ membrane, which was incubated at 37 °C for about
2 hours on a new LB plate. This membrane was treated with 0.5 N NaOH containing 1.5 M NaCl, then with 0.5 M Tris-HCl
(pH 7.5) containing 1.5 M NaCl, washed with 2 x SSC and air dried, and the plasmid DNA was fixed to the membrane by
irradiation with ultraviolet rays on a UV transilluminator. This membrane was treated at 50 °C for 2 hours in a
hybridization buffer and transferred to the same buffer containing the 32P-labeled DNA probe used for the Southern
hybridization described in Example 2-(2), and hybridization was carried out at 50 °C for 18 hours. After the completion
of hybridization, the membrane was washed in 2 x SSC containing 0.5% SDS at room temperature and then at 37 °C.
Further, the membrane was rinsed with 2 x SSC, air dried and exposed to an X-ray film at -80 °C for 12 hours to
obtain an autoradiogram. About 500 clones were screened and, as a result, 3 clones containing a protease gene were
obtained. From the signals on the autoradiogram, the positions of these clones were determined, and the corresponding
colonies on the plate used for transfer to the membrane were isolated in LB medium.

(4) Detection of protease PFUS gene by PCR 

Oligonucleotides to be used for detection of a hyperthermostable protease gene by PCR were
designed based on the nucleotide sequences encoding two regions in the protease PFUL gene that have high
homology with the amino acid sequences of alkaline serine proteases derived from known microorganisms. Based on
the amino acid sequence of the protease PFUL represented by Figs. 2 and 3, the primers 1FP1, 1FP2, 2RP1 and 2RP2
were synthesized. SEQ ID Nos. 22, 23, 24 and 25 of the Sequence Listing show the nucleotide sequences of the
oligonucleotides 1FP1, 1FP2, 2RP1 and 2RP2.
PCR reaction mixtures containing the plasmids prepared from the above three clones as well as the
oligonucleotides 1FP1 and 2RP1, or 1FP1 and 2RP2, or 1FP2 and 2RP1, or 1FP2 and 2RP2 were prepared, and a 30 cycle
reaction was carried out, each cycle consisting of 94 °C for 30 seconds - 37 °C for 2 minutes - 72 °C for 1 minute.
When aliquots of these reaction mixtures were subjected to agarose gel electrophoresis, amplification of an
about 150 bp DNA fragment was observed for all three plasmids when the primers
1FP2 and 2RP2 were used, indicating that a protease gene was present on these plasmids.
One of the above three clones was selected and designated plasmid pSS3.
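
The four primer combinations tested above are simply every forward primer paired with every reverse primer. A minimal sketch of how such a reaction matrix could be enumerated is given below; only the primer names come from the text, and the variable names are illustrative assumptions.

```python
from itertools import product

forward_primers = ["1FP1", "1FP2"]
reverse_primers = ["2RP1", "2RP2"]

# Every forward primer is paired with every reverse primer: 2 x 2 = 4 reactions.
primer_pairs = list(product(forward_primers, reverse_primers))

for fwd, rev in primer_pairs:
    print(fwd, "+", rev)
```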
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(5) Determination of nucleotide sequence of protease PFUS gene contained in plasmid pSS3 

The nucleotide sequence of the inserted DNA fragment in the plasmid was determined by the dideoxy method 
using the plasmid pSS3 as a template and using the primer M4 and the primer RV (both manufactured by Takara Shuzo 
Co., Ltd.). SEQ ID No. 26 of the Sequence Listing shows the resulting nucleotide sequence and the amino acid
sequence deduced to be encoded by the nucleotide sequence. By comparing the amino acid sequence with
those of the protease PFUL, the protease TCES and subtilisin, it was presumed that the DNA fragment inserted in the
plasmid pSS3 encoded an amino acid sequence having homology with these proteases.

(6) Cloning of N-terminal coding region and C-terminal coding region of protease PFUS by inverse PCR method 

In order to obtain genes encoding the N-terminal and C-terminal amino acid sequences of the protease PFUS,
inverse PCR was carried out. The primers used for the inverse PCR were synthesized based on the nucleotide sequence
of the inserted DNA fragment in the plasmid pSS3. SEQ ID Nos. 27, 28 and 29 of the Sequence Listing show the
nucleotide sequences of the primers NPF-1, NPF-2 and NPR-3.

A chromosomal DNA of Pyrococcus furiosus was digested with SacI and XbaI and was subjected to intramolecular
ligation. PCR mixtures containing an aliquot of the ligation reaction mixture and the primers NPF-1 and NPR-3, or
NPF-2 and NPR-3 were prepared and a 30 cycle reaction was carried out, each cycle consisting of 94 °C for 30 seconds -
67 °C for 10 minutes. When an aliquot of this reaction mixture was subjected to agarose gel electrophoresis, an about
3 kb amplified fragment was observed when the primers NPF-2 and NPR-3 were used. This amplified fragment
was recovered from the agarose gel, mixed with the plasmid vector pT7BlueT (manufactured by Novagen) and
ligated, followed by introduction into Escherichia coli JM109. Plasmids were prepared from the resulting transformants,
and the plasmid containing an about 3 kb fragment was selected and designated plasmid pS322.

On the other hand, an about 9 kb amplified fragment was observed when the primers NPF-1 and
NPR-3 were used. This amplified fragment was recovered from the agarose gel, the DNA ends were made blunt using a DNA
blunting kit, followed by further digestion with XbaI. This was mixed with the plasmid vector pBluescript SK(-) digested
with XbaI and HincII and ligated, followed by introduction into Escherichia coli JM109. Plasmids were prepared
from the resulting transformants, and the plasmid containing an about 5 kb DNA fragment was selected and designated
plasmid pSKX5.
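
The principle of inverse PCR used in this step can be illustrated with a toy string model: a restriction fragment is circularized by intramolecular ligation, and primers pointing outward from the known region amplify the unknown flanking regions joined at the ligation junction. The sketch below is only an illustration under stated assumptions: the sequences are hypothetical and far shorter than real fragments, the function names are invented for the example, and each primer site is assumed to occur only once.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def inverse_pcr_product(upstream, known, downstream, primer_len):
    """Toy model of inverse PCR on a self-ligated restriction fragment.

    The linear fragment upstream + known + downstream is circularized.
    The forward primer equals the last primer_len bases of the known region
    and the reverse primer is the reverse complement of its first primer_len
    bases, so both point outward, away from the known region.
    """
    circle = upstream + known + downstream
    fwd_primer = known[-primer_len:]
    rev_primer = reverse_complement(known[:primer_len])

    # Doubling the string lets a read cross the ligation junction.
    doubled = circle + circle
    start = doubled.find(fwd_primer)
    rev_site = reverse_complement(rev_primer)  # site on the top strand
    end = doubled.find(rev_site, start) + primer_len
    return doubled[start:end]


# Hypothetical sequences, chosen only so that each primer site is unique.
product = inverse_pcr_product("TTGACA", "ATGGCTAGCGGATCC", "CCATAA", 5)
print(product)  # GATCCCCATAATTGACAATGGC
```

In the product, the downstream flank precedes the upstream flank, which is why the cloned inverse PCR fragments yield both the N-terminal and the C-terminal coding regions.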

(7) Sequencing of nucleotide sequence of protease PFUS gene contained in plasmids pS322 and pSKX5

The nucleotide sequence of a gene encoding an N-terminal region of the protease PFUS was determined by the
dideoxy method using the plasmid pS322 as a template and using the primer NPR-3. SEQ ID No. 30 of the Sequence
Listing shows a part of the resulting nucleotide sequence and the amino acid sequence deduced to be encoded by the
nucleotide sequence.

Further, the nucleotide sequence of a region corresponding to a 3' part of the protease PFUS gene was determined
by the dideoxy method using the plasmid pSKX5 as a template and using the primer RV. SEQ ID No. 31 of the
Sequence Listing shows a part of the resulting nucleotide sequence.

(8) Synthesis of primer used for amplification of full length protease PFUS gene 

Based on the nucleotide sequence obtained in Example 2-(7), primers used for amplification of the full length of
the protease PFUS gene were designed. Based on the nucleotide sequence encoding the N-terminal part of the protease
PFUS shown in SEQ ID No. 30 of the Sequence Listing, the primer NPF-4, which can introduce a BamHI site in front of
the initiation codon of the protease PFUS gene, was synthesized. SEQ ID No. 32 of the Sequence Listing shows the
nucleotide sequence of the primer NPF-4. In addition, based on the nucleotide sequence in the vicinity of the 3' region
of the protease PFUS gene shown in SEQ ID No. 31 of the Sequence Listing, the primer NPR-4 having a sequence
complementary to the nucleotide sequence and a SphI site was synthesized. SEQ ID No. 33 of the Sequence Listing
shows the nucleotide sequence of the primer NPR-4.
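
The general layout of primers of this kind, a restriction site placed 5' of a gene-specific stretch, can be sketched as follows. This is a hedged illustration only: the gene-specific sequences and the two-base 5' clamp are hypothetical, the function names are invented for the example, and the actual primers NPF-4 and NPR-4 are those given in SEQ ID Nos. 32 and 33. The recognition sequences of BamHI (GGATCC) and SphI (GCATGC) are the standard ones.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
SPHI_SITE = "GCATGC"   # SphI recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def forward_primer(site, gene_start, clamp="CG"):
    """5' clamp + restriction site + bases matching the start of the gene."""
    return clamp + site + gene_start


def reverse_primer(site, gene_end, clamp="CG"):
    """5' clamp + restriction site + reverse complement of the gene's 3' end."""
    return clamp + site + reverse_complement(gene_end)


# Hypothetical gene-specific stretches; not the sequences of NPF-4 or NPR-4.
fwd = forward_primer(BAMHI_SITE, "ATGAAGGGGCTGAAAGCT")
rev = reverse_primer(SPHI_SITE, "GTTCCAGAAGGTTACTGA")
print(fwd)
print(rev)
```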

(9) Preparation of plasmid pSPT1 containing hybrid gene of protease derived from Pyrococcus furiosus and protease
TCES, for transformation of Bacillus subtilis

Using an LA PCR kit (manufactured by Takara Shuzo Co., Ltd.), a PCR reaction mixture (hereinafter a PCR reaction
mixture prepared by using an LA PCR kit is referred to as "LA-PCR reaction mixture") containing the primers NPF-4
and NPR-4 and a chromosomal DNA of Pyrococcus furiosus was prepared, and a reaction of 30 cycles, each cycle
consisting of 94 °C for 20 seconds - 55 °C for 1 minute - 68 °C for 7 minutes, was carried out to amplify an about 6 kb DNA fragment
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having these two primers on both ends and containing the coding region of the protease PFUS gene. 

The about 6 kb DNA fragment was digested with BamHI and SacI and subjected to 1% agarose gel
electrophoresis to recover the separated about 0.8 kb DNA fragment. This fragment was mixed with the plasmid pSTC3
digested with BamHI and SacI and ligated, followed by introduction into Bacillus subtilis DB104. Plasmids were
prepared from the resulting kanamycin-resistant transformants, and the plasmid containing only one molecule of the
above 0.8 kb fragment was selected and designated plasmid pSPT1.

Bacillus subtilis DB104 harboring the plasmid pSPT1 was designated Bacillus subtilis DB104/pSPT1.

Fig. 14 shows a restriction map of the plasmid pSPT1.

(10) Preparation of plasmid pSNP1 containing protease PFUS gene for transformation of Bacillus subtilis

The about 6 kb DNA fragment amplified in Example 2-(9) was digested with SpeI and SphI and subjected
to 1% agarose gel electrophoresis to recover the separated about 5.7 kb DNA fragment. This was mixed with the
plasmid digested with SpeI and SphI and ligated, followed by introduction into Bacillus subtilis DB104. Plasmids were
prepared from the resulting kanamycin-resistant transformants, and the plasmid containing only one molecule of the 5.7
kb fragment was selected and designated plasmid pSNP1. Bacillus subtilis transformed with the plasmid pSNP1
was designated Bacillus subtilis DB104/pSNP1.

Fig. 15 shows a restriction map of the plasmid pSNP1.

(11) Determination of nucleotide sequence of protease PFUS gene contained in plasmid pSNP1

An about 6 kb DNA fragment containing a protease gene inserted into the plasmid pSNP1 was fragmented into
appropriate sizes with a variety of restriction enzymes, and the fragments were subcloned into the plasmid vector
pUC119 or pBluescript SK(-). The nucleotide sequence was determined by the dideoxy method using the resulting
recombinant plasmids as templates and using a commercially available universal primer. For the parts from which
fragments of appropriate size could not be obtained, the primer walking method was used with synthetic
primers. The nucleotide sequence of an open reading frame present in the nucleotide sequence of the DNA fragment
inserted into the plasmid pSNP1 thus determined, and the amino acid sequence of a protease derived from Pyrococcus
furiosus deduced from the nucleotide sequence are shown in SEQ ID Nos. 34 and 35, respectively.
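
Locating an open reading frame in a determined nucleotide sequence amounts to finding an initiation codon followed, in the same frame, by a stop codon. The following minimal sketch illustrates the idea on a hypothetical toy sequence; the function name `longest_orf` is an assumption, only the given strand is scanned, and this is not the procedure by which SEQ ID No. 34 was actually derived.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return the longest ATG...stop open reading frame on the given strand.

    Only the forward strand is scanned; an empty string is returned when no
    complete open reading frame is found.
    """
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                candidate = seq[start:pos + 3]
                if len(candidate) > len(best):
                    best = candidate
                break
    return best


# Hypothetical toy sequence, not taken from SEQ ID No. 34.
print(longest_orf("CCATGAAATTTGGGTAACC"))  # ATGAAATTTGGGTAA
```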


(12) Synthesis of primer for amplification of protease PFUS gene

In order to design a primer which is used for amplification of the full length protease PFUS gene and hybridizes to
the 3' part of the gene, the nucleotide sequence of the 3' part of the gene was determined. First, an about 0.6 kb DNA
fragment containing the 3' region of the protease PFUS gene, obtained by digestion of the plasmid pSNP1 with BamHI,
was ligated with the plasmid vector pUC119 which had been digested with BamHI and dephosphorylated with alkaline
phosphatase. The resulting recombinant plasmid was designated plasmid pSNPD, and the nucleotide sequence of the
region corresponding to the 3' part of the protease PFUS gene was determined by the dideoxy method using this
plasmid as a template. SEQ ID No. 38 of the Sequence Listing shows the nucleotide sequence, from the BamHI site to 80 bp
upstream nucleotide, present in the region (the sequence of the complementary chain). Then, based on the sequence,
the primer NPM-1 which hybridizes to the 3' part of the protease PFUS gene and contains a SphI site was synthesized.
SEQ ID No. 39 of the Sequence Listing shows the nucleotide sequence of the primer NPM-1.

In addition, the primers mutRR and mutFR for eliminating the BamHI sites which are present about 1.7 kb
downstream from the initiation codon within the protease PFUS gene were synthesized. SEQ ID Nos. 40 and 41 of the
Sequence Listing show the nucleotide sequences of the primers mutRR and mutFR, respectively.

(13) Preparation of plasmid pPS1 containing full length protease PFUS gene

Two sets of LA-PCR reaction mixtures containing Pyrococcus furiosus chromosomal DNA as a template and a
combination of the primers NPF-4 and mutRR or a combination of the primers mutFR and NPM-1 were prepared, and
a reaction of 30 cycles, each cycle consisting of 94 °C for 30 seconds - 55 °C for 1 minute - 68 °C for 3 minutes, was
carried out. When agarose gel electrophoresis was carried out using an aliquot of each reaction mixture, an about 1.8 kb
DNA fragment was amplified when the primers NPF-4 and mutRR were used, and an about 0.6 kb DNA fragment
when the primers mutFR and NPM-1 were used.
Each amplified DNA fragment, from which the primers had been removed by using SUPREC-02 (manufactured by
Takara Shuzo Co., Ltd.), was prepared from the two PCR mixtures. An LA-PCR reaction mixture containing both
of these amplified DNA fragments and not containing the primers and LA Taq was prepared, which was used to carry
out heat denaturation at 94 °C for 10 minutes, followed by cooling to 30 °C over 30 minutes and maintaining at 30 °C
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for 15 minutes to form a hetero duplex. Then, to this reaction mixture, LA Taq was added and was incubated at 72 °C 
for 3 minutes, the primers NPF-4 and NPM-1 were added thereto and a reaction of 25 cycles, each cycle consisting of 
94 °C for 30 seconds - 55 °C for 1 minute - 68 °C for 3 minutes, was carried out. Amplification of an about 2.4 kb DNA 
fragment was observed in this reaction mixture. 
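
The fusion of the two PCR products through their shared region can be pictured with a toy string model. This is a sketch under stated assumptions, not the method itself: the sequences are hypothetical, the function name `join_by_overlap` is invented for the example, and it is assumed that the two products overlap at the region covered by the mutagenic primers. The shared stretch carries GGTTCC where the BamHI recognition sequence GGATCC would otherwise be.

```python
def join_by_overlap(left, right, min_overlap=8):
    """Join two fragments that share an identical stretch at their junction.

    The longest suffix of `left` that equals a prefix of `right` is used as the
    overlap, mimicking the annealing and extension step that fuses the two PCR
    products. Raises ValueError if the overlap is shorter than min_overlap.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("fragments do not overlap")


# Hypothetical sequences. The shared stretch carries GGTTCC in place of the
# BamHI recognition sequence GGATCC, as a mutagenic primer pair would.
shared = "GCTGGTTCCTTAGG"
upstream_product = "ATGCCGTACGAA" + shared
downstream_product = shared + "CTGCAGTAA"

full_length = join_by_overlap(upstream_product, downstream_product)
print(full_length)
print("GGATCC" in full_length)  # False
```

The lengths behave as in the text: the joined product is shorter than the sum of the two fragments by the length of the overlap, consistent with about 1.8 kb and about 0.6 kb fragments giving an about 2.4 kb product.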

The about 2.4 kb DNA fragment was digested with BamHI and SphI, and the fragment was mixed with the plasmid
pSNP1, described in Example 2-(11), from which the full length protease PFUS gene had been removed previously by
digestion with BamHI and SphI, and ligated, followed by introduction into Bacillus subtilis DB104. Plasmids were
prepared from the resulting kanamycin-resistant transformants, and the plasmid with only one molecule of the about 2.4
kb fragment inserted was selected and designated plasmid pPS1. Bacillus subtilis DB104 transformed with the plasmid
pPS1 was designated Bacillus subtilis DB104/pPS1.

Fig. 16 shows a restriction map of the plasmid pPS1.

(14) Amplification of DNA fragment of a region from the promoter to the signal sequence of subtilisin gene 

Primers for obtaining the region from the promoter to the signal sequence of the subtilisin gene were synthesized.
First, with reference to the nucleotide sequence of the promoter region of the subtilisin gene described in J. Bacteriol.,
volume 171, page 2657-2665 (1989), the primer SUB4, which hybridizes to a part upstream of the region and contains
an EcoRI site, was synthesized (SEQ ID No. 36 of the Sequence Listing shows the nucleotide sequence of the primer
SUB4). Then, with reference to the nucleotide sequence of the region encoding subtilisin described in J. Bacteriol.,
volume 158, page 411-418 (1984), the primer BmR1, which can introduce a BamHI site just behind the signal sequence,
was synthesized (SEQ ID No. 37 of the Sequence Listing shows the nucleotide sequence of the primer BmR1).

The plasmid pKWZ containing the subtilisin gene described in J. Bacteriol., volume 171, page 2657-2665 (1989) was
used as a template to prepare a PCR reaction mixture containing the primers SUB4 and BmR1, and a reaction of 30
cycles, each cycle consisting of 94 °C for 30 seconds - 55 °C for 1 minute - 68 °C for 2 minutes, was carried out.
Agarose gel electrophoresis of an aliquot of this reaction mixture confirmed amplification of an about 0.3 kb DNA fragment.

(15) Preparation of plasmid pNAPS1 containing protease PFUS gene for transformation of Bacillus subtilis

The about 0.3 kb DNA fragment was digested with EcoRI and BamHI and mixed with the plasmid pPS1,
described in Example 2-(13), which had previously been digested with EcoRI and BamHI, and ligated, followed by
introduction into Bacillus subtilis DB104. Plasmids were prepared from the resulting kanamycin-resistant transformants,
and the plasmid containing only one molecule of the about 0.3 kb fragment was selected and designated plasmid
pNAPS1. In addition, Bacillus subtilis DB104 transformed with the plasmid pNAPS1 was designated Bacillus subtilis
DB104/pNAPS1.

Fig. 17 shows a restriction map of the plasmid pNAPS1.

Example 3 

(1) Preparation of probe for detecting hyperthermostable protease gene 

The plasmid pTPR12 containing the protease PFUL gene was digested with BalI and HincII (both manufactured by
Takara Shuzo Co., Ltd.) and subjected to 1% agarose gel electrophoresis to recover the separated about 1 kb
DNA fragment. A 32P-labeled DNA probe was prepared using the DNA fragment as a template and using BcaBEST
DNA Labeling Kit and [α-32P]dCTP.

(2) Detection of hyperthermostable protease gene present in hyperthermophile Staphylothermus marinus and
Thermobacteroides proteoliticus

Chromosomal DNAs were prepared from 10 ml each of cultures of Staphylothermus marinus DSM3639 and
Thermobacteroides proteoliticus DSM5265 obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen
GmbH according to the procedures described in Example 1-(3). Both chromosomal DNAs were digested with EcoRI,
PstI, HindIII, XbaI and SacI, respectively, and subjected to 1% agarose gel electrophoresis, followed by Southern
hybridization according to the procedures described in Example 1-(5). As a probe, the 32P-labeled DNA probe prepared
in Example 3-(1) was used. The membrane was washed at 37 °C in 2 x SSC containing SDS at a final concentration
of 0.5% and rinsed with 2 x SSC, and an autoradiogram was obtained. On this autoradiogram, a signal was recognized
on an about 4.8 kb DNA fragment in the case of Staphylothermus marinus chromosomal DNA digested with PstI, and
on an about 3.5 kb DNA fragment in the case of Thermobacteroides proteoliticus chromosomal DNA digested with XbaI,
thus indicating that a hyperthermostable protease gene which hybridizes with the protease PFUL gene was present in the Staphylothermus
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marinus and Thermobacteroides proteoliticus chromosomal DNA.
Example 4 

(1) Preparation of crude enzyme preparation of protease PFUS and TCES 

Bacillus subtilis DB104 into which the plasmid pSTC3 containing the hyperthermostable protease gene of the
present invention had been introduced (Bacillus subtilis DB104/pSTC3) was cultured in 5 ml of LB medium (tryptone 10
g/liter, yeast extract 5 g/liter, NaCl 5 g/liter, pH 7.2) containing 10 µg/ml kanamycin at 37 °C for 8 hours. 250 ml of the
same medium was prepared in a 1 liter Erlenmeyer flask, inoculated with 5 ml of the above culture and cultured
at 37 °C for 16 hours. Ammonium sulfate was added to a supernatant obtained by centrifugation of the culture to 75%
saturation, and the resulting precipitates were recovered by centrifugation. The recovered precipitates were suspended
in 4 ml of 20 mM Tris-HCl, pH 7.5, and dialyzed against the same buffer, and the resulting dialysate was used as
a crude enzyme preparation (enzyme preparation TC-3).

Crude enzyme preparations were prepared from Bacillus subtilis DB104 into which the plasmid pSNP1 containing
the hyperthermostable protease gene of the present invention had been introduced (Bacillus subtilis DB104/pSNP1) or
Bacillus subtilis DB104 into which the plasmid pSPT1 containing the hyperthermostable protease gene of the present
invention had been introduced, according to the procedures described above, and the preparations were designated
NP-1 and PT-1, respectively.

These enzyme preparations were used to examine the protease activity by the enzyme activity detecting method 
using the SDS-polyacrylamide gel containing gelatin or by the other activity detecting methods. 

(2) Preparation of purified enzyme preparation of protease PFUS 

Two tubes containing 5 ml of LB medium containing 10 µg/ml kanamycin were inoculated with Bacillus subtilis
DB104 into which the plasmid pNAPS1 containing the hyperthermostable protease gene of the present invention
obtained in Example 2-(15) had been introduced (Bacillus subtilis DB104/pNAPS1), followed by cultivation at 37 °C for 7
hours with shaking. Six 500 ml Erlenmeyer flasks, each containing 120 ml of the same medium, were
prepared, and each flask was inoculated with 1 ml of the above culture, followed by cultivation at 37 °C for 17 hours with
shaking. The culture was centrifuged to obtain the cells and a culture supernatant.

The cells were suspended in 15 ml of 50 mM Tris-HCl, pH 7.5, and 30 mg of lysozyme (manufactured by Sigma)
was added thereto, followed by digestion at 37 °C for 15 hours. The digestion solution was heat-treated at 95 °C for 15
minutes, followed by centrifugation to collect a supernatant. To 12 ml of the resulting supernatant was added 4 ml of a
saturated ammonium sulfate solution, and the mixture was filtered through a 0.45 µm filter unit (Sterivex HV,
manufactured by Millipore). The filtrate was loaded onto a POROS PH column (4.6 mm x 150 mm; manufactured by
PerSeptive Biosystems) equilibrated with 25 mM Tris-HCl, pH 7.5, containing ammonium sulfate at 25% saturation.
The column was washed with the buffer used for equilibration, and gradient elution was performed by lowering the
concentration of ammonium sulfate from 25% saturation to 0% saturation and at the same time increasing the
concentration of acetonitrile from 0% to 20% to elute the PFUS protease, to obtain the purified enzyme preparation NAPS-1.

750 ml of the culture supernatant was dialyzed against 25 mM Tris-HCl, pH 8.0, and adsorbed onto an Econo-Pack Q
cartridge (manufactured by BioRad) equilibrated with the same buffer. Then, the adsorbed enzyme was eluted with a
linear gradient of 0 to 1.5 M NaCl. The resulting active fraction was heat-treated at 95 °C for 1 hour, and a 1/3 volume
of a saturated ammonium sulfate solution was added thereto. After filtration through a 0.45 µm filter
unit (Sterivex HV), the filtrate was loaded onto the POROS PH column (4.6 mm x 150 mm) equilibrated with 25 mM
Tris-HCl, pH 7.5, containing ammonium sulfate at 25% saturation. The PFUS protease adsorbed onto the column was eluted
according to the same procedures as for the enzyme preparation NAPS-1 to obtain the purified enzyme preparation NAPS-1S.
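
Both loading conditions described above bring the sample to the 25% ammonium sulfate saturation of the column equilibration buffer, as the following minimal sketch shows. The function name `percent_saturation` is an illustrative assumption, and the calculation assumes that the sample contains no ammonium sulfate beforehand and that volumes are additive.

```python
def percent_saturation(sample_volume, saturated_volume):
    """Percent saturation after adding saturated ammonium sulfate solution.

    Assumes the sample itself contains no ammonium sulfate and that the
    volumes are additive.
    """
    return 100.0 * saturated_volume / (sample_volume + saturated_volume)


# Cell extract: 4 ml of saturated solution added to 12 ml of supernatant.
print(percent_saturation(12.0, 4.0))  # 25.0
# Culture supernatant: a 1/3 volume added to one volume of active fraction.
print(percent_saturation(1.0, 1.0 / 3.0))
```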

To an appropriate amount of the purified enzyme preparation NAPS-1 or NAPS-1S was added trichloroacetic acid
to a final concentration of 8.3% to precipitate the proteins in the enzyme preparation, which were recovered by
centrifugation. The recovered precipitated proteins were dissolved in distilled water, a 1/4 volume of a sample buffer (50
mM Tris-HCl, pH 7.5, 5% SDS, 5% 2-mercaptoethanol, 0.005% Bromophenol Blue, 50% glycerol) was added thereto,
and the mixture was treated at 100 °C for 5 minutes and subjected to electrophoresis on a 0.1% SDS-10% polyacrylamide gel.
After the run, the gel was stained in 2.5% Coomassie Brilliant Blue R-250, 25% ethanol, and 10% acetic acid for 30
minutes, then transferred into 25% methanol and 7% acetic acid, and the excess dye was removed over 3 to 15 hours. Both
enzyme preparations NAPS-1 and NAPS-1S showed a single band, and the molecular weight deduced from the migration
distance was about 4.5 kDa.
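
The dilutions in the sample preparation above can be worked through numerically. The sketch below is an illustration under stated assumptions, not part of the described procedure: the names `final_concentrations` and `stock_volume_for_final` are hypothetical, volumes are taken to be additive, and the dissolved protein sample is assumed to contribute none of the buffer components. The concentration of the trichloroacetic acid stock is not given in the text, so any stock value passed to the second function is an assumption.

```python
# Composition of the sample buffer as stated in the text.
SAMPLE_BUFFER = {
    "Tris-HCl (mM)": 50.0,
    "SDS (%)": 5.0,
    "2-mercaptoethanol (%)": 5.0,
    "Bromophenol Blue (%)": 0.005,
    "glycerol (%)": 50.0,
}


def final_concentrations(stock, buffer_per_sample_volume=0.25):
    """Concentrations after adding `buffer_per_sample_volume` volumes of buffer
    to one volume of sample (assumed free of the buffer components)."""
    if buffer_per_sample_volume <= 0:
        raise ValueError("buffer volume ratio must be positive")
    fraction = buffer_per_sample_volume / (1.0 + buffer_per_sample_volume)
    return {name: conc * fraction for name, conc in stock.items()}


def stock_volume_for_final(final_percent, stock_percent, sample_ml):
    """Volume of stock to add to `sample_ml` so the mixture reaches `final_percent`."""
    if not 0 < final_percent < stock_percent:
        raise ValueError("final concentration must lie between 0 and the stock value")
    return sample_ml * final_percent / (stock_percent - final_percent)


for name, value in final_concentrations(SAMPLE_BUFFER).items():
    print(name, round(value, 4))
```

Adding a 1/4 volume means the buffer makes up one fifth of the final mixture, so each component ends up at one fifth of its stock concentration, for example 1% SDS and 10% glycerol.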

(3) Sequencing of the N-terminus of the mature protease PFUS

The purified enzyme preparation NAPS-1 prepared in Example 4-(2) was subjected to electrophoresis on a 0.1%
SDS-10% polyacrylamide gel, and the proteins on the gel were blotted onto a PVDF membrane (manufactured by
Millipore) using a Semidry Blotter (manufactured by Nihon Eido). Blotting was carried out according to the method described in
Electrophoresis, volume 11, page 573-580 (1990). After blotting, the membrane was stained with a solution of 1%
Coomassie Brilliant Blue R-250 in 50% methanol and destained with a 60% methanol solution. The stained part of the
membrane was cut out, and the N-terminal amino acid sequence was determined by
automated Edman degradation using a G1000A protein sequencer (manufactured by Hewlett-Packard). SEQ ID No. 42
shows the resultant N-terminal amino acid sequence.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Takara Shuzo Co., Ltd. 

(B) STREET: 609, Takenaka-cho, Fushimi-ku 

(C) CITY: Kyoto-shi, Kyoto 

(E) COUNTRY: Japan 

(F) ZIP: 612 

(ii) TITLE OF INVENTION: Hyperthermostable protease gene 

(iii) NUMBER OF SEQUENCES: 42 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 

(B) COMPUTER: IBM PS/2 Model 50Z or 55SX 

(C) OPERATING SYSTEM: MS-DOS (Version 5.0) 

(D) SOFTWARE: Microsoft Word 

(v) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: EP 96937514.6 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/JP96/03253

(B) FILING DATE: 7. November 1996 



(2) INFORMATION FOR SEQ ID NO: 1

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 659 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Met Lys Arg Leu Gly Ala Val Val Leu Ala Leu Val Leu Val Gly
5 10 15

Leu Leu Ala Gly Thr Ala Leu Ala Ala Pro Val Lys Pro Val Val
20 25 30

Arg Asn Asn Ala Val Gln Gln Lys Asn Tyr Gly Leu Leu Thr Pro
35 40 45

Gly Leu Phe Lys Lys Val Gln Arg Met Asn Trp Asn Gln Glu Val
50 55 60

Asp Thr Val Ile Met Phe Gly Ser Tyr Gly Asp Arg Asp Arg Ala
65 70 75

Val Lys Val Leu Arg Leu Met Gly Ala Gln Val Lys Tyr Ser Tyr
80 85 90

Lys Ile Ile Pro Ala Val Ala Val Lys Ile Lys Ala Arg Asp Leu
95 100 105

Leu Leu Ile Ala Gly Met Ile Asp Thr Gly Tyr Phe Gly Asn Thr
110 115 120

Arg Val Ser Gly Ile Lys Phe Ile Gln Glu Asp Tyr Lys Val Gln
125 130 135

Val Asp Asp Ala Thr Ser Val Ser Gln Ile Gly Ala Asp Thr Val
140 145 150

Trp Asn Ser Leu Gly Tyr Asp Gly Ser Gly Val Val Val Ala Ile
155 160 165

Val Asp Thr Gly Ile Asp Ala Asn His Pro Asp Leu Lys Gly Lys
170 175 180

Val Ile Gly Trp Tyr Asp Ala Val Asn Gly Arg Ser Thr Pro Tyr
185 190 195

Asp Asp Gln Gly His Gly Thr His Val Ala Gly Ile Val Ala Gly
200 205 210

Thr Gly Ser Val Asn Ser Gln Tyr Ile Gly Val Ala Pro Gly Ala
215 220 225

Lys Leu Val Gly Val Lys Val Leu Gly Ala Asp Gly Ser Gly Ser
230 235 240

Val Ser Thr Ile Ile Ala Gly Val Asp Trp Val Val Gln Asn Lys
245 250 255

Asp Lys Tyr Gly Ile Arg Val Ile Asn Leu Ser Leu Gly Ser Ser
260 265 270

Gln Ser Ser Asp Gly Thr Asp Ser Leu Ser Gln Ala Val Asn Asn
275 280 285

Ala Trp Asp Ala Gly Ile Val Val Cys Val Ala Ala Gly Asn Ser
290 295 300

Gly Pro Asn Thr Tyr Thr Val Gly Ser Pro Ala Ala Ala Ser Lys
305 310 315

Val Ile Thr Val Gly Ala Val Asp Ser Asn Asp Asn Ile Ala Ser
320 325 330

Phe Ser Ser Arg Gly Pro Thr Ala Asp Gly Arg Leu Lys Pro Glu
335 340 345

Val Val Ala Pro Gly Val Asp Ile Ile Ala Pro Arg Ala Ser Gly
350 355 360

Thr Ser Met Gly Thr Pro Ile Asn Asp Tyr Tyr Thr Lys Ala Ser
365 370 375

Gly Thr Ser Met Ala Thr Pro His Val Ser Gly Val Gly Ala Leu
380 385 390

Ile Leu Gln Ala His Pro Ser Trp Thr Pro Asp Lys Val Lys Thr
395 400 405

Ala Leu Ile Glu Thr Ala Asp Ile Val Ala Pro Lys Glu Ile Ala
410 415 420

Asp Ile Ala Tyr Gly Ala Gly Arg Val Asn Val Tyr Lys Ala Ile
425 430 435

Lys Tyr Asp Asp Tyr Ala Lys Leu Thr Phe Thr Gly Ser Val Ala
440 445 450

Asp Lys Gly Ser Ala Thr His Thr Phe Asp Val Ser Gly Ala Thr
455 460 465

Phe Val Thr Ala Thr Leu Tyr Trp Asp Thr Gly Ser Ser Asp Ile
470 475 480

Asp Leu Tyr Leu Tyr Asp Pro Asn Gly Asn Glu Val Asp Tyr Ser
485 490 495

Tyr Thr Ala Tyr Tyr Gly Phe Glu Lys Val Gly Tyr Tyr Asn Pro
500 505 510

Thr Ala Gly Thr Trp Thr Val Lys Val Val Ser Tyr Lys Gly Ala
515 520 525

Ala Asn Tyr Gln Val Asp Val Val Ser Asp Gly Ser Leu Ser Gln
530 535 540

Ser Gly Gly Gly Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro Thr
545 550 555

Pro Thr Thr Asp Thr Gln Thr Phe Thr Gly Ser Val Asn Asp Tyr
560 565 570

Trp Asp Thr Ser Asp Thr Phe Thr Met Asn Val Asn Ser Gly Ala
575 580 585

Thr Lys Ile Thr Gly Asp Leu Thr Phe Asp Thr Ser Tyr Asn Asp
590 595 600

Leu Asp Leu Tyr Leu Tyr Asp Pro Asn Gly Asn Leu Val Asp Arg
605 610 615

Ser Thr Ser Ser Asn Ser Tyr Glu His Val Glu Tyr Ala Asn Pro
620 625 630

Ala Pro Gly Thr Trp Thr Phe Leu Val Tyr Ala Tyr Ser Thr Tyr
635 640 645

Gly Trp Ala Asp Tyr Gln Leu Lys Ala Val Val Tyr Tyr Gly
650 655

(2) INFORMATION FOR SEQ ID NO: 2 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 1977 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: Thermococcus celer

(B) STRAIN: DSM2476

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATGAAGAGGT TAGGTGCTGT GGTGCTGGCA CTGGTGCTCG TGGGTCTTCT GGCCGGAACG 60 

GCCCTTGCGG CACCCGTAAA ACCGGTTGTC AGGAACAACG CGGTTCAGCA GAAGAACTAC 120 

GGACTGCTGA CCCCGGGACT GTTCAAGAAA GTCCAGAGGA TGAACTGGAA CCAGGAAGTG 180 

GACACCGTCA TAATGTTCGG GAGCTACGGA GACAGGGACA GGGCGGTTAA GGTACTGAGG 240 

CTCATGGGCG CCCAGGTCAA GTACTCCTAC AAGATAATCC CTGCTGTCGC GGTTAAAATA 300

AAGGCCAGGG ACCTTCTGCT GATCGCGGGC ATGATAGACA CGGGTTACTT CGGTAACACA 360 

AGGGTCTCGG GCATAAAGTT CATACAGGAG GATTACAAGG TTCAGGTTGA CGACGCCACT 420

TCCGTCTCCC AGATAGGGGC CGATACCGTC TGGAACTCCC TCGGCTACGA CGGAAGCGGT 480 

GTGGTGGTTG CCATCGTCGA TACGGGTATA GACGCGAACC ACCCCGATCT GAAGGGCAAG 540 

GTCATAGGCT GGTACGACGC CGTCAACGGC AGGTCGACCC CCTACGATGA CCAGGGACAC 600 

GGAACCCACG TTGCGGGTAT CGTTGCCGGA ACCGGCAGCG TTAACTCCCA GTACATAGGC 660 

GTCGCCCCCG GCGCGAAGCT CGTCGGCGTC AAGGTTCTCG GTGCCGACGG TTCGGGAAGC 720 

GTCTCCACCA TCATCGCGGG TGTTGACTGG GTCGTCCAGA ACAAGGACAA GTACGGGATA 780

AGGGTCATCA ACCTCTCCCT CGGCTCCTCC CAGAGCTCCG ACGGAACCGA CTCCCTCAGT 840 

CAGGCCGTCA ACAACGCCTG GGACGCCGGT ATAGTAGTCT GCGTCGCCGC CGGCAACAGC 900 

GGGCCGAACA CCTACACCGT CGGCTCACCC GCCGCCGCGA GCAAGGTCAT AACCGTCGGT 960 

GCAGTTGACA GCAACGACAA CATCGCCAGC TTCTCCAGCA GGGGACCGAC CGCGGACGGA 1020 

AGGCTCAAGC CGGAAGTCGT CGCCCCCGGC GTTGACATCA TAGCCCCGCG CGCCAGCGGA 1080

ACCAGCATGG GCACCCCGAT AAACGACTAC TACACCAAGG CCTCTGGAAC CAGCATGGCC 1140 

ACCCCGCACG TTTCGGGCGT TGGCGCGCTC ATCCTCCAGG CCCACCCGAG CTGGACCCCG 1200 

GACAAGGTGA AGACCGCCCT CATCGAGACC GCCGACATAG TCGCCCCCAA GGAGATAGCG 1260 

GACATCGCCT ACGGTGCGGG TAGGGTGAAC GTCTACAAGG CCATCAAGTA CGACGACTAC 1320 

GCCAAGCTCA CCTTCACCGG CTCCGTCGCC GACAAGGGAA GCGCCACCCA CACCTTCGAC 1380 

GTCAGCGGCG CCACCTTCGT GACCGCCACC CTCTACTGGG ACACGGGCTC GAGCGACATC 1440

GACCTCTACC TCTACGACCC CAACGGGAAC GAGGTTGACT ACTCCTACAC CGCCTACTAC 1500 

GGCTTCGAGA AGGTCGGCTA CTACAACCCG ACCGCCGGAA CCTGGACGGT CAAGGTCGTC 1560 

AGCTACAAGG GCGCGGCGAA CTACCAGGTC GACGTCGTCA GCGACGGGAG CCTCAGCCAG 1620 

TCCGGCGGCG GCAACCCGAA TCCAAACCCC AACCCGAACC CAACCCCGAC CACCGACACC 1680 

CAGACCTTCA CCGGTTCCGT TAACGACTAC TGGGACACCA GCGACACCTT CACCATGAAC 1740

GTCAACAGCG GTGCCACCAA GATAACCGGT GACCTGACCT TCGATACTTC CTACAACGAC 1800 

CTCGACCTCT ACCTCTACGA CCCCAACGGC AACCTCGTTG ACAGGTCCAC GTCGAGCAAC 1860

AGCTACGAGC ACGTCGAGTA CGCCAACCCC GCCCCGGGAA CCTGGACGTT CCTCGTCTAC 1920 

GCCTACAGCA CCTACGGCTG GGCGGACTAC CAGCTCAAGG CCGTCGTCTA CTACGGG 1977 
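
The relationship between a nucleotide listing and its amino acid listing can be checked by translation. The sketch below is illustrative only and assumes the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical helpers, not part of the listing. It translates the first 45 nucleotides of SEQ ID NO: 2 and confirms that a length of 1977 nucleotides corresponds to 659 codons.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second, third base in T, C, A, G order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residue codes."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return " ".join(
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, len(dna), 3)
    )


# First 45 nucleotides of SEQ ID NO: 2, as listed above.
first_row = translate("ATGAAGAGGT TAGGTGCTGT GGTGCTGGCA CTGGTGCTCG TGGGT")
print(first_row)
print(divmod(1977, 3))
```

The translated residues can be compared directly with the first row of SEQ ID NO: 1.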

(2) INFORMATION FOR SEQ ID NO: 3

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 522

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE: Xaa at 428 position is Gly or Val.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Ala Glu Leu Glu Gly Leu Asp Glu Ser Ala Ala Gln Val Met Ala
5 10 15

Thr Tyr Val Trp Asn Leu Gly Tyr Asp Gly Ser Gly Ile Thr Ile
20 25 30

Gly Ile Ile Asp Thr Gly Ile Asp Ala Ser His Pro Asp Leu Gln
35 40 45

Gly Lys Val Ile Gly Trp Val Asp Phe Val Asn Gly Arg Ser Tyr
50 55 60

Pro Tyr Asp Asp His Gly His Gly Thr His Val Ala Ser Ile Ala
65 70 75

Ala Gly Thr Gly Ala Ala Ser Asn Gly Lys Tyr Lys Gly Met Ala
80 85 90

Pro Gly Ala Lys Leu Ala Gly Ile Lys Val Leu Gly Ala Asp Gly
95 100 105

Ser Gly Ser Ile Ser Thr Ile Ile Lys Gly Val Glu Trp Ala Val
110 115 120

Asp Asn Lys Asp Lys Tyr Gly Ile Lys Val Ile Asn Leu Ser Leu
125 130 135

Gly Ser Ser Gln Ser Ser Asp Gly Thr Asp Ala Leu Ser Gln Ala
140 145 150

Val Asn Ala Ala Trp Asp Ala Gly Leu Val Val Val Val Ala Ala
155 160 165

Gly Asn Ser Gly Pro Asn Lys Tyr Thr Ile Gly Ser Pro Ala Ala
170 175 180

Ala Ser Lys Val Ile Thr Val Gly Ala Val Asp Lys Tyr Asp Val
185 190 195

Ile Thr Ser Phe Ser Ser Arg Gly Pro Thr Ala Asp Gly Arg Leu
200 205 210

Lys Pro Glu Val Val Ala Pro Gly Asn Trp Ile Ile Ala Ala Arg
215 220 225

Ala Ser Gly Thr Ser Met Gly Gln Pro Ile Asn Asp Tyr Tyr Thr
230 235 240

Ala Ala Pro Gly Thr Ser Met Ala Thr Pro His Val Ala Gly Ile
245 250 255

Ala Ala Leu Leu Leu Gln Ala His Pro Ser Trp Thr Pro Asp Lys
260 265 270

Val Lys Thr Ala Leu Ile Glu Thr Ala Asp Ile Val Lys Pro Asp
275 280 285

Glu Ile Ala Asp Ile Ala Tyr Gly Ala Gly Arg Val Asn Ala Tyr
290 295 300

Lys Ala Ile Asn Tyr Asp Asn Tyr Ala Lys Leu Val Phe Thr Gly
305 310 315

Tyr Val Ala Asn Lys Gly Ser Gln Thr His Gln Phe Val Ile Ser
320 325 330

Gly Ala Ser Phe Val Thr Ala Thr Leu Tyr Trp Asp Asn Ala Asn
335 340 345

Ser Asp Leu Asp Leu Tyr Leu Tyr Asp Pro Asn Gly Asn Gln Val
350 355 360

Asp Tyr Ser Tyr Thr Ala Tyr Tyr Gly Phe Glu Lys Val Gly Tyr
365 370 375

Tyr Asn Pro Thr Asp Gly Thr Trp Thr Ile Lys Val Val Ser Tyr
380 385 390

Ser Gly Ser Ala Asn Tyr Gln Val Asp Val Val Ser Asp Gly Ser
395 400 405

Leu Ser Gln Pro Gly Ser Ser Pro Ser Pro Gln Pro Glu Pro Thr
410 415 420

Val Asp Ala Lys Thr Phe Gln Xaa Ser Asp His Tyr Tyr Tyr Asp
425 430 435

Arg Ser Asp Thr Phe Thr Met Thr Val Asn Ser Gly Ala Thr Lys
440 445 450

Ile Thr Gly Asp Leu Val Phe Asp Thr Ser Tyr His Asp Leu Asp
455 460 465

Leu Tyr Leu Tyr Asp Pro Asn Gln Lys Leu Val Asp Arg Ser Glu
470 475 480

Ser Pro Asn Ser Tyr Glu His Val Glu Tyr Leu Thr Pro Ala Pro
485 490 495

Gly Thr Trp Tyr Phe Leu Val Tyr Ala Tyr Tyr Thr Tyr Gly Trp
500 505 510

Ala Tyr Tyr Glu Leu Thr Ala Lys Val Tyr Tyr Gly
515 520

(2) INFORMATION FOR SEQ ID NO: 4 
(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 1566 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: Pyrococcus furiosus
(B) STRAIN: DSM3638

(ix) FEATURE: N at 1283 position is G or T. 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GCAGAATTAG AAGGACTGGA TGAGTCTGCA GCTCAAGTTA TGGCAACTTA CGTTTGGAAC 60
TTGGGATATG ATGGTTCTGG AATCACAATA GGAATAATTG ACACTGGAAT TGACGCTTCT 120 
CATCCAGATC TCCAAGGAAA AGTAATTGGG TGGGTAGATT TTGTCAATGG TAGGAGTTAT 180 
CCATACGATG ACCATGGACA TGGAACTCAT GTAGCTTCAA TAGCAGCTGG TACTGGAGCA 240 
GCAAGTAATG GCAAGTACAA GGGAATGGCT CCAGGAGCTA AGCTGGCGGG AATTAAGGTT 300 
CTAGGTGCCG ATGGTTCTGG AAGCATATCT ACTATAATTA AGGGAGTTGA GTGGGCCGTT 360 

GATAACAAAG ATAAGTACGG AATTAAGGTC ATTAATCTTT CTCTTGGTTC AAGCCAGAGC 420

TCAGATGGTA CTGACGCTCT AAGTCAGGCT GTTAATGCAG CGTGGGATGC TGGATTAGTT 480 
GTTGTGGTTG CCGCTGGAAA CAGTGGACCT AACAAGTATA CAATCGGTTC TCCAGCAGCT 540 
GCAAGCAAAG TTATTACAGT TGGAGCCGTT GACAAGTATG ATGTTATAAC AAGCTTCTCA 600 
AGCAGAGGGC CAACTGCAGA CGGCAGGCTT AAGCCTGAGG TTGTTGCTCC AGGAAACTGG 660 

ATAATTGCTG CCAGAGCAAG TGGAACTAGC ATGGGTCAAC CAATTAATGA CTATTACACA 720

GCAGCTCCTG GGACATCAAT GGCAACTCCT CACGTAGCTG GTATTGCAGC CCTCTTGCTC 780 
CAAGCACACC CGAGCTGGAC TCCAGACAAA GTAAAAACAG CCCTCATAGA AACTGCTGAT 840 
ATCGTAAAGC CAGATGAAAT AGCCGATATA GCCTACGGTG CAGGTAGGGT TAATGCATAC 900 
AAGGCTATAA ACTACGATAA CTATGCAAAG CTAGTGTTCA CTGGATATGT TGCCAACAAA 960 
GGCAGCCAAA CTCACCAGTT CGTTATTAGC GGAGCTTCGT TCGTAACTGC CACATTATAC 1020
TGGGACAATG CCAATAGCGA CCTTGATCTT TACCTCTACG ATCCCAATGG AAACCAGGTT 1080

GACTACTCTT ACACCGCCTA CTATGGATTC GAAAAGGTTG GTTATTACAA CCCAACTGAT 1140 

GGAACATGGA CAATTAAGGT TGTAAGCTAC AGCGGAAGTG CAAACTATCA AGTAGATGTG 1200 

GTAAGTGATG GTTCCCTTTC ACAGCCTGGA AGTTCACCAT CTCCACAACC AGAACCAACA 1260 

GTAGACGCAA AGACGTTCCA AGNATCCGAT CACTACTACT ATGACAGGAG CGACACCTTT 1320

ACAATGACCG TTAACTCTGG GGCTACAAAG ATTACTGGAG ACCTAGTGTT TGACACAAGC 1380 

TACCATGATC TTGACCTTTA CCTCTACGAT CCTAACCAGA AGCTTGTAGA TAGATCGGAG 1440 

AGTCCCAACA GCTACGAACA CGTAGAATAC TTAACCCCCG CCCCAGGAAC CTGGTACTTC 1500 

CTAGTATATG CCTACTACAC TTACGGTTGG GCTTACTACG AGGTGACGGC TAAAGTTTAT 1560 

TATGGC 1566
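
The ambiguity features of SEQ ID NO: 3 and SEQ ID NO: 4 can be related to each other by locating the codon that contains nucleotide 1283. The following sketch is illustrative only and assumes the standard genetic code; `codon_position` and `residues_for_ambiguity` are hypothetical helper names. The codon `GNA` is read from nucleotides 1282 to 1284 of the listing above.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second, third base in T, C, A, G order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def codon_position(nt_position):
    """Return (codon number, position within codon), both 1-based,
    for a coding sequence that starts at nucleotide 1."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1


def residues_for_ambiguity(codon, choices):
    """Residues encoded when the single 'N' in `codon` is replaced by each base."""
    if codon.count("N") != 1:
        raise ValueError("codon must contain exactly one N")
    return [THREE_LETTER[CODON_TABLE[codon.replace("N", base)]] for base in choices]


print(codon_position(1283))
print(residues_for_ambiguity("GNA", "GT"))
```

Nucleotide 1283 is the second base of codon 428, and substituting G or T there gives Gly or Val, in agreement with the Xaa feature of SEQ ID NO: 3.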



(2) INFORMATION FOR SEQ ID NO: 5

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 659

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Lys Gly Leu Lys Ala Leu Ile Leu Val Ile Leu Val Leu Gly
5 10 15

Leu Val Val Gly Ser Val Ala Ala Ala Pro Glu Lys Lys Val Glu
20 25 30

Gln Val Arg Asn Val Glu Lys Asn Tyr Gly Leu Leu Thr Pro Gly
35 40 45

Leu Phe Arg Lys Ile Gln Lys Leu Asn Pro Asn Glu Glu Ile Ser
50 55 60

Thr Val Ile Val Phe Glu Asn His Arg Glu Lys Glu Ile Ala Val
65 70 75

Arg Val Leu Glu Leu Met Gly Ala Lys Val Arg Tyr Val Tyr His
80 85 90

Ile Ile Pro Ala Ile Ala Ala Asp Leu Lys Val Arg Asp Leu Leu
95 100 105

Val Ile Ser Gly Leu Thr Gly Gly Lys Ala Lys Leu Ser Gly Val
110 115 120

Arg Phe Ile Gln Glu Asp Tyr Lys Val Thr Val Ser Ala Glu Leu
125 130 135

Glu Gly Leu Asp Glu Ser Ala Ala Gln Val Met Ala Thr Tyr Val
140 145 150

Trp Asn Leu Gly Tyr Asp Gly Ser Gly Ile Thr Ile Gly Ile Ile
155 160 165

Asp Thr Gly Ile Asp Ala Ser His Pro Asp Leu Gln Gly Lys Val
170 175 180

Ile Gly Trp Val Asp Phe Val Asn Gly Arg Ser Tyr Pro Tyr Asp
185 190 195

Asp His Gly His Gly Thr His Val Ala Ser Ile Ala Ala Gly Thr
200 205 210

Gly Ala Ala Ser Asn Gly Lys Tyr Lys Gly Met Ala Pro Gly Ala
215 220 225

Lys Leu Ala Gly Ile Lys Val Leu Gly Ala Asp Gly Ser Gly Ser
230 235 240

Ile Ser Thr Ile Ile Lys Gly Val Glu Trp Ala Val Asp Asn Lys
245 250 255

Asp Lys Tyr Gly Ile Lys Val Ile Asn Leu Ser Leu Gly Ser Ser
260 265 270

Gln Ser Ser Asp Gly Thr Asp Ser Leu Ser Gln Ala Val Asn Asn
275 280 285

Ala Trp Asp Ala Gly Ile Val Val Cys Val Ala Ala Gly Asn Ser
290 295 300

Gly Pro Asn Thr Tyr Thr Val Gly Ser Pro Ala Ala Ala Ser Lys
305 310 315

Val Ile Thr Val Gly Ala Val Asp Ser Asn Asp Asn Ile Ala Ser
320 325 330

Phe Ser Ser Arg Gly Pro Thr Ala Asp Gly Arg Leu Lys Pro Glu
335 340 345

Val Val Ala Pro Gly Val Asp Ile Ile Ala Pro Arg Ala Ser Gly
350 355 360

Thr Ser Met Gly Thr Pro Ile Asn Asp Tyr Tyr Thr Lys Ala Ser
365 370 375

Gly Thr Ser Met Ala Thr Pro His Val Ser Gly Val Gly Ala Leu
380 385 390

Ile Leu Gln Ala His Pro Ser Trp Thr Pro Asp Lys Val Lys Thr
395 400 405

Ala Leu Ile Glu Thr Ala Asp Ile Val Ala Pro Lys Glu Ile Ala
410 415 420

Asp Ile Ala Tyr Gly Ala Gly Arg Val Asn Val Tyr Lys Ala Ile
425 430 435

Lys Tyr Asp Asp Tyr Ala Lys Leu Thr Phe Thr Gly Ser Val Ala
440 445 450

Asp Lys Gly Ser Ala Thr His Thr Phe Asp Val Ser Gly Ala Thr
455 460 465

Phe Val Thr Ala Thr Leu Tyr Trp Asp Thr Gly Ser Ser Asp Ile
470 475 480

Asp Leu Tyr Leu Tyr Asp Pro Asn Gly Asn Glu Val Asp Tyr Ser
485 490 495

Tyr Thr Ala Tyr Tyr Gly Phe Glu Lys Val Gly Tyr Tyr Asn Pro
500 505 510

Thr Ala Gly Thr Trp Thr Val Lys Val Val Ser Tyr Lys Gly Ala
515 520 525

Ala Asn Tyr Gln Val Asp Val Val Ser Asp Gly Ser Leu Ser Gln
530 535 540

Ser Gly Gly Gly Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro Thr
545 550 555

Pro Thr Thr Asp Thr Gln Thr Phe Thr Gly Ser Val Asn Asp Tyr
560 565 570

Trp Asp Thr Ser Asp Thr Phe Thr Met Asn Val Asn Ser Gly Ala
575 580 585

Thr Lys Ile Thr Gly Asp Leu Thr Phe Asp Thr Ser Tyr Asn Asp
590 595 600

Leu Asp Leu Tyr Leu Tyr Asp Pro Asn Gly Asn Leu Val Asp Arg
605 610 615

Ser Thr Ser Ser Asn Ser Tyr Glu His Val Glu Tyr Ala Asn Pro
620 625 630

Ala Pro Gly Thr Trp Thr Phe Leu Val Tyr Ala Tyr Ser Thr Tyr
635 640 645

Gly Trp Ala Asp Tyr Gln Leu Lys Ala Val Val Tyr Tyr Gly
650 655

(2) INFORMATION FOR SEQ ID NO: 6 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 1977 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

ATGAAGGGGC TGAAAGCTCT CATATTAGTG ATTTTAGTTC TAGGTTTGGT AGTAGGGAGC 60

GTAGCGGCAG CTCCAGAGAA GAAAGTTGAA CAAGTAAGAA ATGTTGAGAA GAACTATGGT 120 

CTGCTAACGC CAGGACTGTT CAGAAAAATT CAAAAATTGA ATCCTAACGA GGAAATCAGC 180 

ACAGTAATTG TATTTGAAAA CCATAGGGAA AAAGAAATTG CAGTAAGAGT TCTTGAGTTA 240

ATGGGTGCAA AAGTTAGGTA TGTGTACCAT ATTATACCCG CAATAGCTGC CGATCTTAAG 300

GTTAGAGACT TACTAGTCAT CTCAGGTTTA ACAGGGGGTA AAGCTAAGCT TTCAGGTGTT 360 

AGGTTTATCC AGGAAGACTA CAAAGTTACA GTTTCAGCAG AATTAGAAGG ACTGGATGAG 420 

TCTGCAGCTC AAGTTATGGC AACTTACGTT TGGAACTTGG GATATGATGG TTCTGGAATC 480 

ACAATAGGAA TAATTGACAC TGGAATTGAC GCTTCTCATC CAGATCTCCA AGGAAAAGTA 540 

ATTGGGTGGG TAGATTTTGT CAATGGTAGG AGTTATCCAT ACGATGACCA TGGACATGGA 600 

ACTCATGTAG CTTCAATAGC AGCTGGTACT GGAGCAGCAA GTAATGGCAA GTACAAGGGA 660 

ATGGCTCCAG GAGCTAAGCT GGCGGGAATT AAGGTTCTAG GTGCCGATGG TTCTGGAAGC 720

ATATCTACTA TAATTAAGGG AGTTGAGTGG GCCGTTGATA ACAAAGATAA GTACGGAATT 780 

AAGGTCATTA ATCTTTCTCT TGGTTCAAGC CAGAGCTCCG ACGGAACCGA CTCCCTCAGT 840 

CAGGCCGTCA ACAACGCCTG GGACGCCGGT ATAGTAGTCT GCGTCGCCGC CGGCAACAGC 900 

GGGCCGAACA CCTACACCGT CGGCTCACCC GCCGCCGCGA GCAAGGTCAT AACCGTCGGT 960 

GCAGTTGACA GCAACGACAA CATCGCCAGC TTCTCCAGCA GGGGACCGAC CGCGGACGGA 1020

AGGCTCAAGC CGGAAGTCGT CGCCCCCGGC GTTGACATCA TAGCCCCGCG CGCCAGCGGA 1080 

ACCAGCATGG GCACCCCGAT AAACGACTAC TACACCAAGG CCTCTGGAAC CAGCATGGCC 1140 

ACCCCGCACG TTTCGGGCGT TGGCGCGCTC ATCCTCCAGG CCCACCCGAG CTGGACCCCG 1200 

GACAAGGTGA AGACCGCCCT CATCGAGACC GCCGACATAG TCGCCCCCAA GGAGATAGCG 1260 

GACATCGCCT ACGGTGCGGG TAGGGTGAAC GTCTACAAGG CCATCAAGTA CGACGACTAC 1320

GCCAAGCTCA CCTTCACCGG CTCCGTCGCC GACAAGGGAA GCGCCACCCA CACCTTCGAC 1380

GTCAGCGGCG CCACCTTCGT GACCGCCACC CTCTACTGGG ACACGGGCTC GAGCGACATC 1440 

GACCTCTACC TCTACGACCC CAACGGGAAC GAGGTTGACT ACTCCTACAC CGCCTACTAC 1500 

GGCTTCGAGA AGGTCGGCTA CTACAACCCG ACCGCCGGAA CCTGGACGGT CAAGGTCGTC 1560

AGCTACAAGG GCGCGGCGAA CTACCAGGTC GACGTCGTCA GCGACGGGAG CCTCAGCCAG 1620 

TCCGGCGGCG GCAACCCGAA TCCAAACCCC AACCCGAACC CAACCCCGAC CACCGACACC 1680 

CAGACCTTCA CCGGTTCCGT TAACGACTAC TGGGACACCA GCGACACCTT CACCATGAAC 1740 

GTCAACAGCG GTGCCACCAA GATAACCGGT GACCTGACCT TCGATACTTC CTACAACGAC 1800 

CTCGACCTCT ACCTCTACGA CCCCAACGGC AACCTCGTTG ACAGGTCCAC GTCGAGCAAC 1860

AGCTACGAGC ACGTCGAGTA CGCCAACCCC GCCCCGGGAA CCTGGACGTT CCTCGTCTAC 1920 

GCCTACAGCA CCTACGGCTG GGCGGACTAC CAGCTCAAGG CCGTCGTCTA CTACGGG 1977 






(2) INFORMATION FOR SEQ ID NO: 7 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 4765 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE: Pyrococcus furiosus

(B) STRAIN: DSM3638 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

TTTAAATTAT AAGATATAAT CACTCCGAGT GATGAGTAAG ATACATCATT ACAGTCCCAA 60 

AATGTTTATA ATTGGAACGC AGTGAATATA CAAAATGAAT ATAACCTCGG AGGTGACTGT 120 

AGAATGAATA AGAAGGGACT TACTGTGCTA TTTATAGCGA TAATGCTCCT TTCAGTAGTT 180 

CCAGTGCACT TTGTGTCCGC AGAAACACCA CCGGTTAGTT CAGAAAATTC AACAACTTCT 240

ATACTCCCTA ACCAACAAGT TGTGACAAAA GAAGTTTCAC AAGCGGCGCT TAATGCTATA 300 

ATGAAAGGAC AACCCAACAT GGTTCTTATA ATCAAGACTA AGGAAGGCAA ACTTGAAGAG 360

GCAAAAACCG AGCTTGAAAA GCTAGGTGCA GAGATTCTTG ACGAAAATAG AGTTCTTAAC 420 

ATGTTGCTAG TTAAGATTAA GCCTGAGAAA GTTAAAGAGC TCAACTATAT CTCATCTCTT 480 

GAAAAAGCCT GGCTTAACAG AGAAGTTAAG CTTTCCCCTC CAATTGTCGA AAAGGACGTC 540

AAGACTAAGG AGCCCTCCCT AGAACCAAAA ATGTATAACA GCACCTGGGT AATTAATGCT 600 

CTCCAGTTCA TCCAGGAATT TGGATATGAT GGTAGTGGTG TTGTTGTTGC AGTACTTGAC 660 

ACGGGAGTTG ATCCGAACCA TCCTTTCTTG AGCATAACTC CAGATGGACG CAGGAAAATT 720 

ATAGAATGGA AGGATTTTAC AGACGAGGGA TTCGTGGATA CATCATTCAG CTTTAGCAAG 780 
GTTGTAAATG GGACTCTTAT AATTAACACA ACATTCCAAG TGGCCTCAGG TCTCACGCTG 840 
AATGAATCGA CAGGACTTAT GGAATACGTT GTTAAGACTG TTTACGTGAG CAATGTGACC 900 
ATTGGAAATA TCACTTCTGC TAATGGCATC TATCACTTCG GCCTGCTCCC AGAAAGATAC 960 

TTCGACTTAA ACTTCGATGG TGATCAAGAG GACTTCTATC CTGTCTTATT AGTTAACTCC 1020 

ACTGGCAATG GTTATGACAT TGCATATGTG GATACTGACC TTGACTACGA CTTCACCGAC 1080 

GAAGTTCCAC TTGGCCAGTA CAACGTTACT TATGATGTTG CTGTTTTTAG CTACTACTAC 1140 

GGTCCTCTCA ACTACGTGCT TGCAGAAATA GATCCTAACG GAGAATATGC AGTATTTGGG 1200

TGGGATGGTC ACGGTCACGG AACTCACGTA GCTGGAACTG TTGCTGGTTA CGACAGCAAC 1260 

AATGATGCTT GGGATTGGCT CAGTATGTAC TCTGGTGAAT GGGAAGTGTT CTCAAGACTC 1320 

TATGGTTGGG ATTATACGAA CGTTACCACA GACACCGTGC AGGGTGTTGC TCCAGGTGCC 1380

CAAATAATGG CAATAAGAGT TCTTAGGAGT GATGGACGGG GTAGCATGTG GGATATTATA 1440 

GAAGGTATGA CATACGCAGC AACCCATGGT GCAGACGTTA TAAGCATGAG TCTCGGTGGA 1500

AATGCTCCAT ACTTAGATGG TACTGATCCA GAAAGCGTTG CTGTGGATGA GCTTACCGAA 1560 

AAGTACGGTG TTGTATTCGT AATAGCTGCA GGAAATGAAG GTCCTGGCAT TAACATCGTT 1620 

GGAAGTCCTG GTGTTGCAAC AAAGGCAATA ACTGTTGGAG CTGCTGCAGT GCCCATTAAC 1680 

GTTGGAGTTT ATGTTTCCCA AGCACTTGGA TATCCTGATT ACTATGGATT CTATTACTTC 1740 

CCCGCCTACA CAAACGTTAG AATAGCATTC TTCTCAAGCA GAGGGCCGAG AATAGATGGT 1800 

GAAATAAAAC CCAATGTAGT GGCTCCAGGT TACGGAATTT ACTCATCCCT GCCGATGTGG 1860 

ATTGGCGGAG CTGACTTCAT GTCTGGAACT TCGATGGCTA CTCCACATGT CAGCGGTGTC 1920 

GTTGCACTCC TCATAAGCGG GGCAAAGGCC GAGGGAATAT ACTACAATCC AGATATAATT 1980 

AAGAAGGTTC TTGAGAGCGG TGCAACCTGG CTTGAGGGAG ATCCATATAC TGGGCAGAAG 2040

TACACTGAGC TTGACCAAGG TCATGGTCTT GTTAACGTTA CCAAGTCCTG GGAAATCCTT 2100 

AAGGCTATAA ACGGCACCAC TCTCCCAATT GTTGATCACT GGGCAGACAA GTCCTACAGC 2160 

GACTTTGCGG AGTACTTGGG TGTGGACGTT ATAAGAGGTC TCTACGCAAG GAACTCTATA 2220 

CCTGACATTG TCGAGTGGCA CATTAAGTAC GTAGGGGACA CGGAGTACAG AACTTTTGAG 2280 

ATCTATGCAA CTGAGCCATG GATTAAGCCT TTTGTCAGTG GAAGTGTAAT TCTAGAGAAC 2340 

AATACCGAGT TTGTCCTTAG GGTGAAATAT GATGTAGAGG GTCTTGAGCC AGGTCTCTAT 2400 

GTTGGAAGGA TAATCATTGA TGATCCAACA ACGCCAGTTA TTGAAGACGA GATCTTGAAC 2460 

ACAATTGTTA TTCCCGAGAA GTTCACTCCT GAGAACAATT ACACCCTCAC CTGGTATGAT 2520 

ATTAATGGTC CAGAAATGGT GACTCACCAC TTCTTCACTG TGCCTGAGGG AGTGGACGTT 2580 

CTCTACGCGA TGACCACATA CTGGGACTAC GGTCTGTACA GACCAGATGG AATGTTTGTG 2640 

TTCCCATACC AGCTAGATTA TCTTCCCGCT GCAGTCTCAA ATCCAATGCC TGGAAACTGG 2700 

GAGCTAGTAT GGACTGGATT TAACTTTGCA CCCCTCTATG AGTCGGGCTT CCTTGTAAGG 2760 

ATTTACGGAG TAGAGATAAC TCCAAGCGTT TGGTACATTA ACAGGACATA CCTTGACACT 2820 

AACACTGAAT TCTCAATTGA ATTCAATATT ACTAACATCT ATGCCCCAAT TAATGCAACT 2880 

CTAATCCCCA TTGGCCTTGG AACCTACAAT GCGAGCGTTG AAAGCGTTGG TGATGGAGAG 2940 

TTCTTCATAA AGGGCATTGA AGTTCCTGAA GGCACCGCAG AGTTGAAGAT TAGGATAGGC 3000 

AACCCAAGTG TTCCGAATTC AGATCTAGAC TTGTACCTTT ATGACAGTAA AGGCAATTTA 3060 

GTGGCCTTAG ATGGAAACCC AACAGCAGAA GAAGAGGTTG TAGTTGAGTA TCCTAAGCCT 3120 

GGAGTTTATT CAATAGTAGT ACATGGTTAC AGCGTCAGGG ACGAAAATGG TAATCCAACG 3180 

ACAACCACCT TTGACTTAGT TGTTCAAATG ACCCTTGATA ATGGAAACAT AAAGCTTGAC 3240 
AAAGACTCGA TTATTCTTGG AAGCAATGAA AGCGTAGTTG TAACTGCAAA CATAACAATT 3300

GATAGAGATC ATCCTACAGG AGTATACTCT GGTATCATAG AGATTAGAGA TAATGAGGTC 3360 

TACCAGGATA CAAATACTTC AATTGCGAAA ATACCCATAA CTTTGGTAAT TGACAAGGCG 3420 

GACTTTGCCG TTGGTCTCAC ACCAGCAGAG GGAGTACTTG GAGAGGCTAG AAATTACACT 3480 

CTAATTGTAA AGCATGCCCT AACACTAGAG CCTGTGCCAA ATGCTACAGT GATTATAGGA 3540 

AACTACACCT ACCTCACAGA CGAAAACGGT ACAGTGACAT TCACGTATGC TCCAACTAAG 3600 

TTAGGCAGTG ATGAAATCAC AGTCATAGTT AAGAAAGAGA ACTTCAACAC ATTAGAGAAG 3660 

ACCTTCCAAA TCACAGTATC AGAGCCTGAA ATAACTGAAG AGGACATAAA TGAGCCCAAG 3720 

CTTGCAATGT CATCACCAGA AGCAAATGCT ACCATAGTAT CAGTTGAGAT GGAGAGTGAG 3780 

GGTGGCGTTA AAAAGACAGT GACAGTGGAA ATAACTATAA ACGGAACCGC TAATGAGACT 3840 

GCAACAATAG TGGTTCCTGT TCCTAAGAAG GCCGAAAACA TCGAGGTAAG TGGAGACCAC 3900 

GTAATTTCCT ATAGTATAGA GGAAGGAGAG TACGCCAAGT ACGTTATAAT TACAGTGAAG 3960 

TTTGCATCAC CTGTAACAGT AACTGTTACT TACACTATCT ATGCTGGCCC AAGAGTCTCA 4020 

ATCTTGACAC TTAACTTCCT TGGCTACTCA TGGTACAGAC TATATTCACA GAAGTTTGAC 4080 

GAATTGTACC AAAAGGCCCT TGAATTGGGA GTGGACAACG AGACATTAGC TTTAGCCCTC 4140 

AGCTACCATG AAAAAGCCAA AGAGTACTAC GAAT^AGGCCC TTGAGCTTAG CGAGGGTAAC 4200

ATAATCCAAT ACCTTGGAGA CATAAGACTA TTACCTCCAT TAAGACAGGC ATACATCAAT 4260 

GAAATGAAGG CAGTTAAGAT ACTGGAAAAG GCCATAGAAG AATTAGAGGG TGAAGAGTAA 4320 

TCTCCAATTT TTCCCACTTT TTCTTTTATA ACATTCCAAG CCTTTTCTTA GCTTCTTCGC 4380 

TCATTCTATC AGGAGTCCAT GGAGGATCAA AGGTAAGTTC AACCTCCACA TCTCTTACTC 4440 

CTGGGATTTC GAGTACTTTC TCCTCTACAG CTCTAAGAAG CCAGAGAGTT AAAGGACACC 4500 

CAGGAGTTGT CATTGTCATC TTTATATATA CCGTTTTGTC AGGATTAATC TTTAGCTCAT 4560 

AAATTAATCC AAGGTTTACA ACATCCATCC CAATTTCTGG GTCGATAACC TCCTTTAGCT 4620 

TTTCCAGAAT CATTTCTTCA GTAATTTCAA GGTTCTCATC TTTGGTTTCT CTCACAAACC 4680

CAATTTCAAC CTGCCTGATA CCTTCTAACT CCCTAAGCTT GTTATATATC TCCAAAAGAG 4740

TGGCATCATC AATTTTCTCT TTAAA 4765



(2) INFORMATION FOR SEQ ID NO: 8 

(i) SEQUENCE CHARACTERISTICS 
15 (A) LENGTH: 1398 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

20 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Asn Lys Lys Gly Leu Thr Val Leu Phe lie Ala lie Met Leu 

5 10 15 

Leu Ser Val Val Pro Val His Phe Val Ser Ala Glu Thr Pro Pro 
20 25 30 

Val Ser Ser Glu Asn Ser Thr Thr Ser lie Leu Pro Asn Gin Gin 
35 40 45 

Val Val Thr Lys Glu Val Ser Gin Ala Ala Leu Asn Ala He Met 
50 55 60 

Lys Gly Gin Pro Asn Met Val Leu He He Lys Thr Lys Glu Gly 
65 70 75 

.30. Lys Leu Glu Glu Ala Lys Thr Glu Leu Glu Lys Leu . Gly Ala Glu 

80 85 90 

He Leu Asp Glu Asn Arg Val Leu Asn Met Leu Leu Val Lys He 
95 100 105 

Lys Pro Glu Lys Val Lys Glu Leu Asn Tyr lie Ser Ser Leu Glu 
110 115 120 

Lys Ala Trp Leu Asn Arg Glu Val Lys Leu Ser Pro Pro He Val 
125 130 135 

Glu Lys Asp Val Lys Thr Lys Glu Pro Ser Leu Glu Pro Lys Met 
140 145 150 

Tyr Asn Ser Thr Trp Val He Asn Ala Leu Gin Phe He Gin Glu 
155 160 165 

Phe Gly Tyr Asp Gly Ser Gly Val Val Val Ala Val Leu Asp Thr 
170: 175 180 

Gly Val Asp Pro Asn His Pro Phe Leu Ser He Thr Pro Asp Gly 
185 190 195 

45 . Arg Arg Lys lie He Glu Trp Lys Asp Phe Thr Asp Glu Gly Phe 



so 



55 



41 



EP0 870 833A1 



200 






205 










210 

mm> -L \J 


Val Asp Thr Ser Phe Ser Phe 


Ser 


Lys 


Val 


Val 


Asn 


Gly 


Thr 


Leu 


215 






220 








225 


lie He Asn Thr Thr Phe Gin 


Val 


Ala 


Ser 


Gly 


Leu 


Thr 


Leu 


Asn 


230 






235 








240 


Glu Ser Thr Gly Leu Met Glu 


Tyr 


Val 


Val 


Lvs 


Thr 


Val 


Tvr 


Val 


245 






250 








255 


Ser Asn Val Thr He Gly Asn 


He 


Thr 


S er 


Ala 


Asn 


Glv 


lie 


Tvr 


260 






265 










His Phe Gly Leu Leu Pro Glu 


Arg 


Tvr 


Phe 


Asp 


Leu 


Asn 


Phe 


Asp 


275 






280 










285 


Gly Asp Gin Glu Asp Phe Tyr 


Pro 


Val 


Leu 


Leu 


Val 


Asn 


Ser 


Thr 


290 






295 










300 


Gly Asn Gly Tyr Asp He Ala 


Tyr 


Val 


Asp 


Thr 


Asp 


Leu 


Asp 


Tvr 


305 






310 








315 


Asp Phe Thr Asp Glu Val Pro 


Leu 


Gly 


Gin 


Tyr 


Asn 


Val 


Thr 


Tvr 


320 






325 












Asp Val Ala Val Phe Ser Tyr 


Tvr 


Tvr 


Glv 

J: 


Pro 


Leu 


Asn 


Tyr 


Val 

V CA_L 


335 






340 








.3 H O 


Leu Ala Glu He Asp Pro Asn 


Gly 


Glu 


Tyr 


Ala 


Val 


Phe 


Gly 




350 






355 








360 


Asp Gly His Gly His Gly Thr 


His 


Val 


Ala 


Gly 


Thr 


Val 


Ala 


Glv 


365 






370 








375 


Tyr Asp Ser Asn Asn Asp Ala 


Trp 


Asp 


Trp 


Leu 


Ser 


Met 


Tvr 


Ser 


380 






385 








390 


Gly Glu Trp Glu Val Phe Ser 


Arg 


Leu 


Tyr 


Gly 


Trp 


Asp 


Tyr 


Thr 


.395. 






400 






405 


Asn Val Thr Thr Asp Thr Val 


Gin 


Gly 


Val 


Ala 


Pro 


Gly 


Ala 


Gin 

w Jill 


410 






415 








420 


He Met Ala He Arg Val Leu 


Arg 


Ser 


Asp 


Glv 


Ara 


Glv 


Ser 


Met 


425 






430 






435 


Trp Asp He He Glu Gly Met 


Thr 


Tyr 


Ala 


Ala 


Thr 


His 


Gly 


Ala 


440 






445 








450 


Asp Val He Ser Met Ser Leu 


Gly 


Gly 


Asn 


Ala 


Pro 


Tvr 


Leu 


Asp 


455 






4 60 








465 


Gly Thr Asp Pro Glu Ser Val 


Ala 


Val 


Asp 


Glu 


Leu 


Thr 


Glu 


Lys 


470 






475 










480 


Tyr Gly Val Val Phe Val He 


Ala 


Ala 


Gly 


Asn 


Glu 


Glv 


Pro 


Glv 


485 






490 








495 


He Asn He Val Gly Ser Pro 


Gly 


Val 


Ala 


Thr 


Lys 


Ala 


He 


Thr 


500 . 






505 








510 


Val Gly Ala Ala Ala Val Pro 


He 


Asn 


Val 


Gly 


Val 


Tyr 


Val 


Ser 


515 






520 






525 


Gln Ala Leu Gly Tyr Pro Asp Tyr Tyr Gly Phe Tyr Tyr Phe Pro
530 535 540


Ala Tyr Thr Asn Val Arg Ile Ala Phe Phe Ser Ser Arg Gly Pro
545 550 555










Arg lie Asp Gly Glu 


He Lys Pro Asn 


Val Val 


Ala 


Pro 


Gly 




560 




565 






570 


Gly lie Tyr Ser Ser 


Leu Pro Met Trp 


He Gly 


Glv 


Ala 


Asp 


Phe 


575 




580 




JO J 


Met Ser Gly Thr Ser 


Met Ala Thr Pro 


His Val 


Ser 


Glv 


Val 

v ex i 


Val 

v ax 


590 




595 






enn 
Ouu 


Ala Leu Leu lie Ser 


Glv Ala Lvs Ala 


Glu Glv 




Tyr 


xyr 


7\ en 


605 




610 
\j i \j 




DIO 


Pro Asp lie lie Lys 


Lvs Val Leu Glu 


Cor Glv 


Ala 


Thr 


Tr-ri 

irp 


T Q1 1 


620 




625 

V C 






OjU 


Glu Gly Asp Pro Tyr 


Thr Glv Gin Lvs 


Tvr* Thr 


Glu 

VJX U 


T.pn 




G*l r» 


635 




640 






Dfi O 


Gly His Gly Leu Val 


Asn Val Thr Lvs 


Ser Trp 


Glu 

V3JLUL 


T1 p 
lie 


T.cai i 
JueLL 


xiyo 


650 




655 








ODu 


Ala lie Asn Gly Thr 


Thr Leu Pro He 


Val Asd 


His 


Trp 


Ala 




665 




670 






o / *J 


Lys Ser Tyr Ser Asp 


Phe Ala Glu Tyr 


Leu Glv 


Val 

v ax 




Val 

v a i 


Tl P 


680 




685 








Arg Gly Leu Tyr Ala 


Ara Asn Ser He 


Pro Asn 


He 


Val 

V CXI. 


Glu 




695 




700 










His lie Lys Tyr Val 


Glv Asd Thr Glu 


i. y x. ill y 


Thr 

A 111 


Php 
rue 


Gl n 

VJX u 


Tl p 
iic 


710 




715 








7?0 


Tyr Ala Thr Glu Pro 


Trp He Lys Pro 


Phe Val 


Ser 


Gly 


Spr* 


Val 

V ox 


725 




730 






7^ 


lie Leu Glu Asn Asn 


Thr Glu .Phe Val 


Leu Arrr 


Val 








740 




74^ 










Val Glu Gly Leu Glu 


Pro Gly Leu Tyr 


Val Gly 


-*TX y 


Tie 

IXC 


Tie 

IXC 


T1p 

IXC 


755 




760 






765 


Asp Asp Pro Thr Thr 


Pro Val He Glu 


Asn Glu 


He 


Leu 


Asn 


Thr 


770 




775 








780 


He Val He Pro Glu 


Lys Phe Thr Pro 


Glu Asn 


Asn 


Tvr 

1 jYX 


Thr 


Leu 


785 




790 






795 


Thr Trp Tyr Asp He 


Asn Gly Pro Glu 


Met Val 


Thr 


His 


His 


Phe 


800 




805 








810 


Phe Thr Val Pro Glu 


Gly Val Asp Val 


Leu Tyr 


Ala 


Met 


Thr 


Thr 


815 




820 










Tyr Trp Asp Tyr Gly 


Leu Tyr Arg Pro 


Asp Gly 


Met 


Phe 


Val 


Phe 


830 




835 








840 


Pro Tyr Gin Leu Asp 


Tyr Leu Pro Ala 


Ala Val 


Ser 


Asn 


Pro 


Met 


845 




850 








855 


..Pro Gly Asn Trp Glu 


Leu Val Trp Thr 


Gly Phe 


Asn 


Phe 


Ala: 


Pro 


860 




865 








870 


Leu Tyr Glu Ser Gly 


Phe Leu Val Arg 


lie Tyr 


Gly 


Val 


Glu 


He 


875 




880 _ 






885 


Thr Pro Ser Val Trp 


Tyr He Asn Arg 


Thr ^yr 


Leu 


Asp 


Thr 


Asn 
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895 








900 


Tnr Glu Pne 


Ser He 


Glu Phe Asn 


He Thr 


Asn 


He 


Tyr 


Ala Pro 




9Uo 




910 






915 


He Asn Ala 


Thr Leu 


lie Pro He 


Gly Leu 


Gly 


Thr 


Tyr 


Asn Ala 




yzu 




925 




930 


oer vai Caiu 


Ser val 


Gly Asp Gly 


Glu Phe 


Phe 


lie 


Lys 


Gly He 








940 






945 


bill Vai AriTO 


blU Giy 


inr Ala Glu 


Leu Lys 


lie 


Arg 


. lie 


Gly Asn 








955 








960 


rXO oei vai 


Pro Asn 


oer Asp Leu 


Asp Leu 


Tyr 


Leu 


Tyr 


Asp Ser 




965 




970 






975 


Lys Gly Asn 


Leu Val 


Ala Leu Asp 


Gly Asn 


Pro 


Thr 


Ala 


Glu Glu 




980 




985 








990 


Glu Vai Vai 


Val Glu 


Tyr Pro Lys 


Pro Gly 


Val 


Tyr 


Ser 


lie Val 




995 




1000 








1005 


vai his Gly 


Tyr Ser Val Arg Asp 


Glu Asn 


Gly 


Asn 


Pro 


Thr Thr 


1 111 1 nx Irfie 


1010 




1015 






1020 


Asp Leu 


Val Val Gin 


Met Thr 


Leu 


Asp 


Asn 


Gly Asn 




1025 




1030 






1035 


He Lys Leu 


Asp Lys Asp Ser lie 


He Leu 


Gly 


Ser 


Asn 


Glu Ser 




1040 




1045 






1050 


vai vai Val 


Thr Ala 


Asn He Thr 


He Asp 


Arg 


Asp 


His 


Pro Thr 




1055 




1060 






1065 


Gly val Tyr 


Ser Gly 


He He Glu 


He Arg 


Asp 


Asn 


Glu 


Val Tyr 


win Asp inr 


1070 




1075 








1080 


Asn Thr 


Ser lie Ala 


Lys lie 


Pro 


lie 


Thr 


Leu Val 




1085 




1090 








1095 


iic Asp itys 


Ala Asp 


Phe Ala Val 


Gly Leu 


Thr 


Pro 


Ala 


Glu Gly 




1100 




1105 








1110 


vai jueu Giy 


Glu Ala Arg Asn Tyr 


Thr Leu 


He 


Val 


Lys 


His Ala 




1115 




1120 






1125 


lieu Tnr lieu 


Glu Pro 


Val Pro Asn 


Ala Thr 


Val 


lie 


lie 


Gly Asn 




1130 




1135 








1140 


i yr Trir Tyr 


Leu Thr Asp Glu Asn 


Gly Thr 


Val 


Thr 


Phe 


Thr Tyr 




1145 




1150 








1155 


Ala Pro Thr 


Lys Leu 


Gly Ser Asp 


Glu lie 


Thr 


Val 


He 


Val Lys 




1160 




1165 








1170 


Juyo bill rxoli 


Phe Asn 


Thr Leu Glu 


Lys Thr 


Phe 


Gin 


lie 


Thr Val 




1175 




lloO 








1185 


Oei VjIU trlO 


Glu He 


Thr Glu Glu 


Asp lie 


Asn 


Glu 


Pro 


Lys Leu 


Ala Met Ser 


1190 




1195 








1200 


Ser Pro 


Glu Ala Asn 


Ala Thr. 


He 


Val 


Ser 


Val Glu 




1205 




1210 








1215 


Met Glu Ser 


Glu Gly Gly Val Lys 


Lys Thr 


Val 


Thr 


Val 


Glu He 




1220 




1225 








1230 


Thr He Asn 


Gly Thr Ala Asn Glu 


Thr Ala 


Thr 


lie 


Val 


Val Pro 
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1235 






1240 








1245 


Val Pro Lys 


Lys Ala 


Glu Asn 


He 


Glu Val 


Ser 


Gly 


Asp 


His Val 




1250 






1255 






1260 


lie Ser Tyr 


Ser He 


Glu Glu 


Gly 


Glu Tyr 


Ala 


Lys 


Tyr 


Val He 




1265 






1270 








1275 


He Thr Val 


Lys Phe 


Ala Ser 


Pro 


Val Thr 


Val 


Thr 


Val 


Thr Tyr 




1280 






1285 








1290 


Thr He Tyr 


Ala Gly 


Pro Arg 


Val 


Ser He 


Leu 


Thr 


Leu 


Asn Phe 




1295 






1300 








1305 


Leu Gly Tyr 


Ser Trp 


Tyr Arg 


Leu 


Tyr Ser 


Gin 


Lys 


Phe 


Asp Glu 




1310 






1315 








1320 


Leu Tyr Gin 


Lys Ala 


Leu Glu 


Leu 


Gly Val 


Asp 


Asn 


Glu 


Thr Leu 




1325 






1330 








1335 


Ala Leu Ala 


Leu Ser 


Tyr His 


Glu 


Lys Ala 


Lys 


Glu 


Tyr 


Tyr Glu 




1340 






1345 








1350 


Lys Ala Leu 


Glu Leu 


Ser Glu 


Gly 


Asn He 


He 


Gin 


Tyr 


Leu Gly 




1355 






1360 






1365 


Asp He Arg 


Leu Leu 


Pro Pro 


Leu 


Arg Gin 


Ala 


Tyr 


He 


Asn Glu 




1370 






1375 








1380 


Met Lys Ala 


Val Lys 


He Leu 


Glu 


Lys Ala 


He 


Glu 


Glu 


Leu Glu 




1385 






1390 








1395 



Gly Glu Glu 

(2) INFORMATION FOR SEQ ID NO: 9 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 35

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GGWWSDRRTG TTRRHGTHGC DGTDMTYGAC ACBGG 35 
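
SEQ ID NO: 9 to 12 are written with IUPAC ambiguity codes (W, S, D, R, H, M, Y, B, K, V), so each entry stands for a pool of oligonucleotides rather than one fixed sequence. The following is a minimal sketch, assuming the standard IUPAC nucleotide code table, of how the length and pool size (degeneracy) of SEQ ID NO: 9 can be checked. The helper name `degeneracy` and the variable names are illustrative and are not part of the listing.

```python
# IUPAC nucleotide codes mapped to the bases each one allows (standard table).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences covered by a degenerate primer."""
    count = 1
    for code in primer.replace(" ", "").upper():
        count *= len(IUPAC[code])
    return count


# SEQ ID NO: 9 as transcribed in the listing (spaces only separate 10-base blocks).
SEQ_ID_9 = "GGWWSDRRTG TTRRHGTHGC DGTDMTYGAC ACBGG"

primer_length = len(SEQ_ID_9.replace(" ", ""))
pool_size = degeneracy(SEQ_ID_9)
print(primer_length, pool_size)
```

For the sequence as transcribed here, the length agrees with the stated LENGTH of 35, and the nine two-fold and six three-fold positions give a pool of 373,248 sequences.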

(2) INFORMATION FOR SEQ ID NO: 10 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 32 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
KSTCACGGAA CTCACGTDGC BGGMACDGTT GC 32 

(2) INFORMATION FOR SEQ ID NO: 11 
(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 33

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
ASCMGCAACH GTKCCVGCHA CGTGAGTTCC GTG 33 
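
SEQ ID NO: 10 and SEQ ID NO: 11 appear to address the same region on opposite strands. The sketch below assumes the standard IUPAC complement pairs (R/Y, K/M, B/V, D/H, with S, W and N self-complementary) and uses a hypothetical helper, `reverse_complement`, to compare the two primers as transcribed in this listing. It is a consistency check on the transcription, not a statement about how the primers were designed.

```python
# Complement table for bases and IUPAC ambiguity codes (standard pairings).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a sequence that may contain IUPAC ambiguity codes."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


# Primers as transcribed in the listing (spaces only separate 10-base blocks).
SEQ_ID_10 = "KSTCACGGAA CTCACGTDGC BGGMACDGTT GC"
SEQ_ID_11 = "ASCMGCAACH GTKCCVGCHA CGTGAGTTCC GTG".replace(" ", "")

rc_10 = reverse_complement(SEQ_ID_10)

# The first 29 bases of the reverse complement of SEQ ID NO: 10 coincide with
# SEQ ID NO: 11 after its first four bases.
shared_core = rc_10[:29]
print(shared_core == SEQ_ID_11[4:])
```

The same helper can be applied to the other primer pairs in the listing to see whether two entries are complementary.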

(2) INFORMATION FOR SEQ ID NO: 12 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 34 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CHCCGSYVAC RTGBGGAGWD GCCATBGAVG TDCC 34 






(2) INFORMATION FOR SEQ ID NO: 13 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 145 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (PCR fragment) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

A GTT GCG GTA ATT GAC ACG GGT ATA GAC GCG AAC CAC CCC GAT CTG 46
Val Ala Val Ile Asp Thr Gly Ile Asp Ala Asn His Pro Asp Leu
5 10 15

AAG GGC AAG GTC ATA GGC TGG TAC GAC GCC GTC AAC GGC AGG TCG 91
Lys Gly Lys Val Ile Gly Trp Tyr Asp Ala Val Asn Gly Arg Ser
20 25 30

ACC CCC TAC GAT GAC CAG GGA CAC GGA ACT CAC GTN GCN GGA ACN 136
Thr Pro Tyr Asp Asp Gln Gly His Gly Thr His Val Ala Gly Thr
35 40 45

GTT GCT GGT 145
Val Ala Gly
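
The paired nucleotide and amino acid lines of SEQ ID NO: 13 can be cross-checked by translating the codons. The sketch below assumes the standard genetic code and a reading frame that starts at the second base, as the leading single `A` in the listing suggests. The names `CODON_TABLE` and `translate` are illustrative. A codon containing `N` is translated only when all four possible bases give the same residue, and is reported as `X` otherwise.

```python
from itertools import product

# Standard genetic code, built in TCAG order (one-letter residues, * = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, offset: int = 0) -> str:
    """Translate DNA codon by codon, starting at a 0-based offset."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if "N" in codon:
            # Resolve N only if every base at that position encodes one residue.
            options = {CODON_TABLE[codon.replace("N", b)] for b in BASES}
        else:
            options = {CODON_TABLE[codon]}
        protein.append(options.pop() if len(options) == 1 else "X")
    return "".join(protein)


# SEQ ID NO: 13 as transcribed in the listing.
SEQ_ID_13 = (
    "A GTT GCG GTA ATT GAC ACG GGT ATA GAC GCG AAC CAC CCC GAT CTG "
    "AAG GGC AAG GTC ATA GGC TGG TAC GAC GCC GTC AAC GGC AGG TCG "
    "ACC CCC TAC GAT GAC CAG GGA CAC GGA ACT CAC GTN GCN GGA ACN "
    "GTT GCT GGT"
)

protein_13 = translate(SEQ_ID_13, offset=1)
print(protein_13)
```

The one-letter result can be compared with the three-letter residues printed under each codon row; a mismatch usually points to a transcription error in one of the two lines.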






(2) INFORMATION FOR SEQ ID NO: 14 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 564 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (PCR fragment) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TCT CAC GGA ACT CAC GTG GCG GGA ACA GTT GCC GGA ACA GGC AGC 45
Ser His Gly Thr His Val Ala Gly Thr Val Ala Gly Thr Gly Ser
5 10 15

GTT AAC TCC CAG TAC ATA GGC GTC GCC CCC GGC GCG AAG CTC GTC 90
Val Asn Ser Gln Tyr Ile Gly Val Ala Pro Gly Ala Lys Leu Val
20 25 30

GGT GTC AAG GTT CTC GGT GCC GAC GGT TCG GGA AGC GTC TCC ACC 135
Gly Val Lys Val Leu Gly Ala Asp Gly Ser Gly Ser Val Ser Thr
35 40 45

ATC ATC GCG GGT GTT GAC TGG GTC GTC CAG AAC AAG GAT AAG TAC 180
Ile Ile Ala Gly Val Asp Trp Val Val Gln Asn Lys Asp Lys Tyr
50 55 60

GGG ATA AGG GTC ATC AAC CTC TCC CTC GGC TCC TCC CAG AGC TCC 225
Gly Ile Arg Val Ile Asn Leu Ser Leu Gly Ser Ser Gln Ser Ser
65 70 75

GAC GGA GCC GAC TCC CTC AGT CAG GCC GTC AAC AAC GCC TGG GAC 270
Asp Gly Ala Asp Ser Leu Ser Gln Ala Val Asn Asn Ala Trp Asp
80 85 90

GCC GGT ATA GTA GTC TGC GTC GCC GCC GGC AAC AGC GGG CCG AAC 315
Ala Gly Ile Val Val Cys Val Ala Ala Gly Asn Ser Gly Pro Asn
95 100 105

ACC TAC ACC GTC GGC TCA CCC GCC GCC GCG AGC AAG GTC ATA ACC 360
Thr Tyr Thr Val Gly Ser Pro Ala Ala Ala Ser Lys Val Ile Thr
110 115 120

GTC GGT GCA GTT GAC AGC AAC GAC AAC ATC GCC AGC TTC TCC AGC 405
Val Gly Ala Val Asp Ser Asn Asp Asn Ile Ala Ser Phe Ser Ser
125 130 135

AGG GGA CCG ACC GCG GAC GGA AGG CTC AAG CCG GAA GTC GTC GCC 450
Arg Gly Pro Thr Ala Asp Gly Arg Leu Lys Pro Glu Val Val Ala
140 145 150

CCC GGC GTT GAC ATC ATA GCC CCG CGC GCC AGC GGA ACC AGC ATG 495
Pro Gly Val Asp Ile Ile Ala Pro Arg Ala Ser Gly Thr Ser Met
155 160 165

GGC ACC CCG ATA AAC GAC TAC TAC ACC AAG GCC TCT GGA ACC TCA 540
Gly Thr Pro Ile Asn Asp Tyr Tyr Thr Lys Ala Ser Gly Thr Ser
170 175 180

ATG GCC ACT CCC CAT GTT ACC GGT 564
Met Ala Thr Pro His Val Thr Gly
185



(2) INFORMATION FOR SEQ ID NO: 15 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 1859 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 



(vi) ORIGINAL SOURCE: Thermococcus celer
(B) STRAIN: DSM2476 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GAGCTCCGAC GGAACCGACT CCCTCAGTCA GGCCGTCAAC AACGCCTGGG ACGCCGGTAT 60 

AGTAGTCTGC GTCGCCGCCG GCAACAGCGG GCCGAACACC TACACCGTCG GCTCACCCGC 120 

CGCCGCGAGC AAGGTCATAA CCGTCGGTGC AGTTGACAGC AACGACAACA TCGCCAGCTT 180

CTCCAGCAGG GGACCGACCG CGGACGGAAG GCTCAAGCCG GAAGTCGTCG CCCCCGGCGT 240 

TGACATCATA GCCCCGCGCG CCAGCGGAAC CAGCATGGGC ACCCCGATAA ACGACTACTA 300 

CACCAAGGCC TCTGGAACCA GCATGGCCAC CCCGCACGTT TCGGGCGTTG GCGCGCTCAT 360 

CCTCCAGGCC CACCCGAGCT GGACCCCGGA CAAGGTGAAG ACCGCCCTCA TCGAGACCGC 420 

CGACATAGTC GCCCCCAAGG AGATAGCGGA CATCGCCTAC GGTGCGGGTA GGGTGAACGT 480 

CTACAAGGCC ATCAAGTACG ACGACTACGC CAAGCTCACC TTCACCGGCT CCGTCGCCGA 540 

CAAGGGAAGC GCCACCCACA CCTTCGACGT CAGCGGCGCC ACCTTCGTGA CCGCCACCCT 600 

CTACTGGGAC ACGGGCTCGA GCGACATCGA CCTCTACCTC TACGACCCCA ACGGGAACGA 660 

GGTTGACTAC TCCTACACCG CCTACTACGG CTTCGAGAAG GTCGGCTACT ACAACCCGAC 720 

CGCCGGAACC TGGACGGTCA AGGTCGTCAG CTACAAGGGC GCGGCGAACT ACCAGGTCGA 780 

CGTCGTCAGC GACGGGAGCC TCAGCCAGTC CGGCGGCGGC AACCCGAATC CAAACCCCAA 840 

CCCGAACCCA ACCCCGACCA CCGACACCCA GACCTTCACC GGTTCCGTTA ACGACTACTG 900

GGACACCAGC GACACCTTCA CCATGAACGT CAACAGCGGT GCCACCAAGA TAACCGGTGA 960 

CCTGACCTTC GATACTTCCT ACAACGACCT CGACCTCTAC CTCTACGACC CCAACGGCAA 1020 

CCTCGTTGAC AGGTCCACGT CGAGCAACAG CTACGAGCAC GTCGAGTACG CCAACCCCGC 1080 

CCCGGGAACC TGGACGTTCC TCGTCTACGC CTACAGCACC TACGGCTGGG CGGACTACCA 1140 

GCTCAAGGCC GTCGTCTACT ACGGGTGAAG GTTTTTAATC CCCTTTTCTT TCCCCTTTTG 1200 

AGGTGGTTGG GATGAAGCGG GTTCTTGCGG CGATCCTTGT AATCATGCTC ATCGGATTAT 1260 

CATTCCCTGC CGGAAGTGCT AAAATCGAGC CCTACGTTTA CAGCCCCACC GTTCCGGATA 1320 

CCGCCTTCGC GGTTCTCACC CTGTACAGGA CCGGGGACTA CGCCCGGGTT CTCGAGGGAT 1380 

ACGAGTGGCT CCTCCAGATG AGAACTCCCA TCGATTCGTG GGGGGTTTCC CGCGGGGAAA 1440 

CGCACATGGC CAAGTACACG GCAATGGCGA TGCTGGCCCT CATGCGCGGC GAGAACGTGG 1500 

CGAGGGGGCG TTACAGGGAT GTTCTCAACG ACGCCGCGTA CTGGTTAATA TACAAACAGA 1560 

ACCCGGACGG CTCGTGGGAG GACTACACCG GAACGGCGCT GGCCGTCATC GCGCTCGGGG 1620 

AGTTCCTTAA GGGCGGGTAC ATCAACGCGA ACCTGACCGG CTTCAAAAAG CAGGTTAAAG 1680 

AGGCCGTAAA CCGCGGGGAA GGCTGGCTGA TGGATGCGGA CCCAAAAACG GACGCGGATA 1740

GAATATTCGG CTACCTCGCC CTCGGTAAAA AGGACGAACT CAAAAAGATG AACCCTTCCG 1800 

GTGACCTGAA GGCCTACCGC GCCTTTGCAC TTGCCTACCT CGGGGAGAGG GTCGAGCTC 1859

(2) INFORMATION FOR SEQ ID NO: 16 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 20

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
TGTAGTAGTC GTTTATCGGG 20 

(2) INFORMATION FOR SEQ ID NO: 17



(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 1464 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:



AAGCTTAACA TCGAGCGCTC CACCTCTAAA GTAGGTGAGT GTGGATACGA AGGTTAGGGC 60 

CGCTATGACG ACCTTCAGGA TCCCAACGGC TTCTTTTATG GGGAGCCCGG CGAAGGTGAG 120 

AATTGAAAGG ATTACCATAC TCCCTCCGCT CATCATGGAG CCTATGAATC CCCCTCCAAA 180 

AGAGAGAAGT GCTATAAGGA GCGTCCTCAT GTTCCATGCT ATGTTTTGGT ATTTAATGCT 240 

TTTCCGCTTA ATGTTACACC TCCTCATGAC AATTTCGCGT TTAGGGATGG GGTTAATTGG 300 

ACCCCTCCGA GCCACGGGTT GATGTCCATT ATGTCGATAT TCACCATCTT ATCCCCAACT 360 

TTGTGGGTTT CAAACATTAC CCTACGTTAT ATTTTTATCG TCCTAATTAA CTGCTGAAAC 420 

GGGCGCTTAT CGTTCATCGT TGATGGTTTT GGGTGACCGG GCATTAAGGA ATTGTGTCGT 480 

TTGCTGAAAT TTATGAAACG GAGTTGGCTT CTTTATGTTA CATAAAGATG TACATTACTG 540 

TAATGTATAT AAATGGAAGA AACACTGTTG CGTAAACTTT TTAATGTATC CAATATCAGT 600 

ACTTCGATGT CCCGATATGG GACATGTTGG ATAGGAGGGT ACTGGAATGA AGAGGTTAGG 660 

TGCTGTGGTG CTGGCACTGG TGCTCGTGGG TCTTCTGGCC GGAACGGCCC TTGCGGCACC 720 

CGTAAAACCG GTTGTCAGGA ACAACGCGGT TCAGCAGAAG AACTACGGAC TGCTGACCCC 780 

GGGACTGTTC AAGAAAGTCC AGAGGATGAA CTGGAACCAG GAAGTGGACA CCGTCATAAT 840 

GTTCGGGAGC TACGGAGACA GGGACAGGGC GGTTAAGGTA CTGAGGCTCA TGGGCGCCCA 900 

GGTCAAGTAC TCCTACAAGA TAATCCCTGC TGTCGCGGTT AAAATAAAGG CCAGGGACCT 960 

TCTGCTGATC GCGGGCATGA TAGACACGGG TTACTTCGGT AACACAAGGG TCTCGGGCAT 1020 

AAAGTTCATA CAGGAGGATT ACAAGGTTCA GGTTGACGAC GCCACTTCCG TCTCCCAGAT 1080 

AGGGGCCGAT ACCGTCTGGA ACTCCCTCGG CTACGACGGA AGCGGTGTGG TGGTTGCCAT 1140 

CGTCGATACG GGTATAGACG CGAACCACCC CGATCTGAAG GGCAAGGTCA TAGGCTGGTA 1200 

CGACTCCGTC AACGGCAGGT CGACCCCCTA CGATGACCAG GGACACGGAA CCCACGTTGC 1260 

GGGTATCGTT GCCGGAACCG GGAGCGTTAA CTCCCAGTAC ATAGGCGTCG GCCCCGGCGC 1320 

GAAGCTCGTC GGCGTCAAGG TTCTCGGTTC CGACGGTTCG GGAAGCGTCT CCACCATCAT 1380 

CGCGGGTGTT GACTGGAACG TCCAGAACTA GGACAAGTAC GGGATAAGGG TCATCAACCT 1440 

CTCCCTCGGC TCCTCCCAGA GCTC 1464 






(2) INFORMATION FOR SEQ ID NO: 18 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 33 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
AAAAGAATTC GGATCCATGA AGAGGTTAGG TGC 33

(2) INFORMATION FOR SEQ ID NO: 19
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 28

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TTTTATCGAT CAGGCGTCCC AGGCGTTG 28 

(2) INFORMATION FOR SEQ ID NO: 20 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
CATTATAGGT AAGAGAGGAA TG 22 

(2) INFORMATION FOR SEQ ID NO: 21 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 30 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GATCCATTCC TCTCTTACCT ATAATGGTAC 30 

(2) INFORMATION FOR SEQ ID NO: 22 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 19 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
TAGCAGTAAT TGACACGGG 19 

(2) INFORMATION FOR SEQ ID NO: 23 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
TAGCAGTAAT TGACACTGG 19

(2) INFORMATION FOR SEQ ID NO: 24

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CTGTTCCAGC TACGTGAGTT CC 22 



(2) INFORMATION FOR SEQ ID NO: 25 

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 22

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
CTGTTCCAGC TACATGAGTT CC 22

(2) INFORMATION FOR SEQ ID NO: 26 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 507 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: Pyrococcus furiosus 

(B) STRAIN: DSM3638 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

A CTA GTC ATC TCA GGT TTA ACA GGG GGT AAA GCT AAG CTT TCA GGT 46
Leu Val Ile Ser Gly Leu Thr Gly Gly Lys Ala Lys Leu Ser Gly
5 10 15

GTT AGG TTT ATC CAG GAA GAC TAC AAA GTT ACA GTT TCA GCA GAA 91
Val Arg Phe Ile Gln Glu Asp Tyr Lys Val Thr Val Ser Ala Glu
20 25 30

TTA GAA GGA CTG GAT GAG TCT GCA GCT CAA GTT ATG GCA ACT TAC 136
Leu Glu Gly Leu Asp Glu Ser Ala Ala Gln Val Met Ala Thr Tyr
35 40 45

GTT TGG AAC TTG GGA TAT GAT GGT TCT GGA ATC ACA ATA GGA ATA 181
Val Trp Asn Leu Gly Tyr Asp Gly Ser Gly Ile Thr Ile Gly Ile
50 55 60

ATT GAC ACT GGA ATT GAC GCT TCT CAT CCA GAT CTC CAA GGA AAA 226
Ile Asp Thr Gly Ile Asp Ala Ser His Pro Asp Leu Gln Gly Lys
65 70 75

GTA ATT GGG TGG GTA GAT TTT GTC AAT GGT AGG AGT TAT CCA TAC 271
Val Ile Gly Trp Val Asp Phe Val Asn Gly Arg Ser Tyr Pro Tyr
80 85 90

GAT GAC CAT GGA CAT GGA ACT CAT GTA GCT TCA ATA GCA GCT GGT 316
Asp Asp His Gly His Gly Thr His Val Ala Ser Ile Ala Ala Gly
95 100 105

ACT GGA GCA GCA AGT AAT GGC AAG TAC AAG GGA ATG GCT CCA GGA 361
Thr Gly Ala Ala Ser Asn Gly Lys Tyr Lys Gly Met Ala Pro Gly
110 115 120

GCT AAG CTG GCG GGA ATT AAG GTT CTA GGT GCC GAT GGT TCT GGA 406
Ala Lys Leu Ala Gly Ile Lys Val Leu Gly Ala Asp Gly Ser Gly
125 130 135

AGC ATA TCT ACT ATA ATT AAG GGA GTT GAG TGG GCC GTT GAT AAC 451
Ser Ile Ser Thr Ile Ile Lys Gly Val Glu Trp Ala Val Asp Asn
140 145 150

AAA GAT AAG TAC GGA ATT AAG GTC ATT AAT CTT TCT CTT GGT TCA 496
Lys Asp Lys Tyr Gly Ile Lys Val Ile Asn Leu Ser Leu Gly Ser
155 160 165

AGC CAG AGC TC 507
Ser Gln Ser
168



(2) INFORMATION FOR SEQ ID NO: 27 

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 30

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGACACTGGA ATTGACGCTT CTCATCCAGA 30

(2) INFORMATION FOR SEQ ID NO: 28 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 30

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

TCTCCAAGGA AAAGTAATTG GGTGGGTAGA 30 



(2) INFORMATION FOR SEQ ID NO: 29 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 30

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GTTGCCATAA CTTGAGCTGC AGACTCATCC 30 

(2) INFORMATION FOR SEQ ID NO: 30 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 420

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (PCR fragment)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TTTATTAAGC ATAAAATAGC CATGCAACTT TGATCACTAA TGTGCGGTGG TGCAC ATG 59 

Met 

AAG GGG CTG AAA GCT CTC ATA TTA GTG ATT TTA GTT CTA GGT TTG 104
Lys Gly Leu Lys Ala Leu Ile Leu Val Ile Leu Val Leu Gly Leu
5 10 15

GTA GTA GGG AGC GTA GCG GCA GCT CCA GAG AAG AAA GTT GTT CAA 149
Val Val Gly Ser Val Ala Ala Ala Pro Glu Lys Lys Val Val Gln
20 25 30

GTA AGA AAT GTT GAG AAG AAC TAT GGT CTG CTA ACG CCA GGA CTG 194
Val Arg Asn Val Glu Lys Asn Tyr Gly Leu Leu Thr Pro Gly Leu
35 40 45

TTC AGA AAA ATT CCC AAA TTG GAT CCT AAC GAG GGA ATC AGC ACA 239
Phe Arg Lys Ile Pro Lys Leu Asp Pro Asn Glu Gly Ile Ser Thr
50 55 60

GTA ATT GTA TTT GTT AAC CAT AGG GGA AAA GAA ATT GCA GTA AGA 284
Val Ile Val Phe Val Asn His Arg Gly Lys Glu Ile Ala Val Arg
65 70 75

GTT CTT GAG TTA ATG GGT GCC CAA GTT AGG TAT GTG TAC CAT ATT 329
Val Leu Glu Leu Met Gly Ala Gln Val Arg Tyr Val Tyr His Ile
80 85 90

ATA CCC CCA ATA GCT GCC GAT CTT AAG GTT AGA GAC TTA CTA GTC 374
Ile Pro Pro Ile Ala Ala Asp Leu Lys Val Arg Asp Leu Leu Val
95 100 105

ATC TCA GGT TTA ACA GGG GGT GAA ACT AAG CTT TCA GGT GTT AGG T 420
Ile Ser Gly Leu Thr Gly Gly Glu Thr Lys Leu Ser Gly Val Arg
110 115 120






(2) INFORMATION FOR SEQ ID NO: 31 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 180 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid (PCR fragment) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

GCTCTAGACT CTGGGAGGAG TAGTTATACT TGATGAAGCC TATTCTGAGT TTTCGGGAAA 60 
AAGCTTCATA CCAAAAATCA GTGAGTATGA AAATTTAGTA ATTCTAAGGA CGTTTTCAAA 120 
GGCGTTTGGA CTTGCTGGAA TTAGATGTGG ATATATGATA GCAAATGAAA AGATTATAGA 180 

(2) INFORMATION FOR SEQ ID NO: 32 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 28 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
AGAGGGATCC ATGAAGGGGC TGAAAGCT 28 

(2) INFORMATION FOR SEQ ID NO: 33 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 28 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
AGAGGCATGC GCTCTAGACT CTGGGAGAGT 28

(2) INFORMATION FOR SEQ ID NO: 34 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 1962 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: Pyrococcus furiosus 
(B) STRAIN: DSM3638 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

ATGAAGGGGC TGAAAGCTCT CATATTAGTG ATTTTAGTTC TAGGTTTGGT AGTAGGGAGC 60 

GTAGCGGCAG CTCCAGAGAA GAAAGTTGAA CAAGTAAGAA ATGTTGAGAA GAACTATGGT 120 

CTGCTAACGC CAGGACTGTT CAGAAAAATT CAAAAATTGA ATCCTAACGA GGAAATCAGC 180 

ACAGTAATTG TATTTGAAAA CCATAGGGAA AAAGAAATTG CAGTAAGAGT TCTTGAGTTA 240 

ATGGGTGCAA AAGTTAGGTA TGTGTACCAT ATTATACCCG CAATAGCTGC CGATCTTAAG 300 

GTTAGAGACT TACTAGTCAT CTCAGGTTTA ACAGGGGGTA AAGCTAAGCT TTCAGGTGTT 360 

AGGTTTATCC AGGAAGACTA CAAAGTTACA GTTTCAGCAG AATTAGAAGG ACTGGATGAG 420 

TCTGCAGCTC AAGTTATGGC AACTTACGTT TGGAACTTGG GATATGATGG TTCTGGAATC 480 

ACAATAGGAA TAATTGACAC TGGAATTGAC GCTTCTCATC CAGATCTCCA AGGAAAAGTA 540
ATTGGGTGGG TAGATTTTGT CAATGGTAGG AGTTATCCAT ACGATGACCA TGGACATGGA 600
ACTCATGTAG CTTCAATAGC AGCTGGTACT GGAGCAGCAA GTAATGGCAA GTACAAGGGA 660 
ATGGCTCCAG GAGCTAAGCT GGCGGGAATT AAGGTTCTAG GTGCCGATGG TTCTGGAAGC 720 
ATATCTACTA TAATTAAGGG AGTTGAGTGG GCCGTTGATA ACAAAGATAA GTACGGAATT 780 
AAGGTCATTA ATCTTTCTCT TGGTTCAAGC CAGAGCTCAG ATGGTACTGA CGCTCTAAGT 840 
CAGGCTGTTA ATGCAGCGTG GGATGCTGGA TTAGTTGTTG TGGTTGCCGC TGGAAACAGT 900 
GGACCTAACA AGTATACAAT CGGTTCTCCA GCAGCTGCAA GCAAAGTTAT TACAGTTGGA 960 
GCCGTTGACA AGTATGATGT TATAACAAGC TTCTCAAGCA GAGGGCCAAC TGCAGACGGC 1020
AGGCTTAAGC CTGAGGTTGT TGCTCCAGGA AACTGGATAA TTGCTGCCAG AGCAAGTGGA 1080
ACTAGCATGG GTCAACCAAT TAATGACTAT TACACAGCAG CTCCTGGGAC ATCAATGGCA 1140 
ACTCCTCACG TAGCTGGTAT TGCAGCCCTC TTGCTCCAAG CACACCCGAG CTGGACTCCA 1200 
GACAAAGTAA AAACAGCCCT CATAGAAACT GCTGATATCG TAAAGCCAGA TGAAATAGCC 1260 
GATATAGCCT ACGGTGCAGG TAGGGTTAAT GCATACAAGG CTATAAACTA CGATAACTAT 1320 
GCAAAGCTAG TGTTCACTGG ATATGTTGCC AACAAAGGCA GCCAAACTCA CCAGTTCGTT 1380

ATTAGCGGAG CTTCGTTCGT AACTGCCACA TTATACTGGG ACAATGCCAA TAGCGACCTT 1440 
GATCTTTACC TCTACGATCC CAATGGAAAC CAGGTTGACT ACTCTTACAC CGCCTACTAT 1500 
GGATTCGAAA AGGTTGGTTA TTACAACCCA ACTGATGGAA CATGGACAAT TAAGGTTGTA 1560 
AGCTACAGCG GAAGTGCAAA CTATCAAGTA GATGTGGTAA GTGATGGTTC CCTTTCACAG 1620 
CCTGGAAGTT CACCATCTCC ACAACCAGAA CCAACAGTAG ACGCAAAGAC GTTCCAAGGA 1680

TCCGATCACT ACTACTATGA CAGGAGCGAC ACCTTTACAA TGACCGTTAA CTCTGGGGCT 1740 
ACAAAGATTA CTGGAGACCT AGTGTTTGAC ACAAGCTACC ATGATCTTGA CCTTTACCTC 1800 
TACGATCCTA ACCAGAAGCT TGTAGATAGA TCGGAGAGTC CCAACAGCTA CGAACACGTA 1860 
GAATACTTAA CCCCCGCCCC AGGAACCTGG TACTTCCTAG TATATGCCTA CTACACTTAC 1920 
GGTTGGGCTT ACTACGAGCT GACGGCTAAA GTTTATTATG GC 1962 


(2) INFORMATION FOR SEQ ID NO: 35 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 654 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Met Lys Gly Leu Lys Ala Leu Ile Leu Val Ile Leu Val Leu Gly
5 10 15

Leu Val Val Gly Ser Val Ala Ala Ala Pro Glu Lys Lys Val Glu
20 25 30

Gln Val Arg Asn Val Glu Lys Asn Tyr Gly Leu Leu Thr Pro Gly
35 40 45

Leu Phe Arg Lys Ile Gln Lys Leu Asn Pro Asn Glu Glu Ile Ser
50 55 60

Thr Val Ile Val Phe Glu Asn His Arg Glu Lys Glu Ile Ala Val
65 70 75

Arg Val Leu Glu Leu Met Gly Ala Lys Val Arg Tyr Val Tyr His
80 85 90

Ile Ile Pro Ala Ile Ala Ala Asp Leu Lys Val Arg Asp Leu Leu
95 100 105

Val Ile Ser Gly Leu Thr Gly Gly Lys Ala Lys Leu Ser Gly Val
110 115 120

Arg Phe Ile Gln Glu Asp Tyr Lys Val Thr Val Ser Ala Glu Leu
125 130 135

Glu Gly Leu Asp Glu Ser Ala Ala Gln Val Met Ala Thr Tyr Val
140 145 150

Trp Asn Leu Gly Tyr Asp Gly Ser Gly Ile Thr Ile Gly Ile Ile
155 160 165

Asp Thr Gly Ile Asp Ala Ser His Pro Asp Leu Gln Gly Lys Val
170 175 180

Ile Gly Trp Val Asp Phe Val Asn Gly Arg Ser Tyr Pro Tyr Asp
185 190 195

Asp His Gly His Gly Thr His Val Ala Ser Ile Ala Ala Gly Thr
200 205 210

Gly Ala Ala Ser Asn Gly Lys Tyr Lys Gly Met Ala Pro Gly Ala
215 220 225

Lys Leu Ala Gly Ile Lys Val Leu Gly Ala Asp Gly Ser Gly Ser
230 235 240

Ile Ser Thr Ile Ile Lys Gly Val Glu Trp Ala Val Asp Asn Lys
245 250 255

Asp Lys Tyr Gly Ile Lys Val Ile Asn Leu Ser Leu Gly Ser Ser
260 265 270

Gln Ser Ser Asp Gly Thr Asp Ala Leu Ser Gln Ala Val Asn Ala
275 280 285

Ala Trp Asp Ala Gly Leu Val Val Val Val Ala Ala Gly Asn Ser
290 295 300

Gly Pro Asn Lys Tyr Thr Ile Gly Ser Pro Ala Ala Ala Ser Lys
305 310 315

Val Ile Thr Val Gly Ala Val Asp Lys Tyr Asp Val Ile Thr Ser
320 325 330

Phe Ser Ser Arg Gly Pro Thr Ala Asp Gly Arg Leu Lys Pro Glu
335 340 345

Val Val Ala Pro Gly Asn Trp Ile Ile Ala Ala Arg Ala Ser Gly
350 355 360

Thr Ser Met Gly Gln Pro Ile Asn Asp Tyr Tyr Thr Ala Ala Pro
365 370 375

Gly Thr Ser Met Ala Thr Pro His Val Ala Gly Ile Ala Ala Leu
380 385 390

Leu Leu Gln Ala His Pro Ser Trp Thr Pro Asp Lys Val Lys Thr
395 400 405

Ala Leu Ile Glu Thr Ala Asp Ile Val Lys Pro Asp Glu Ile Ala
410 415 420

Asp Ile Ala Tyr Gly Ala Gly Arg Val Asn Ala Tyr Lys Ala Ile
425 430 435

Asn Tyr Asp Asn Tyr Ala Lys Leu Val Phe Thr Gly Tyr Val Ala
440 445 450


Asn 


Lys 


Gly 


Ser 


Gin 


Thr 


His 


Gin 


Phe 


Val 


He 


Ser 


Gly 


Ala 


Ser 










455 










460 








465 


Phe 


Val 


Thr 


Ala 


Thr 


Leu 


Tyr Trp 


Asp 


Asn 


Ala 


Asn 


Ser 


Asp 


Leu 










470 










475 








480 


Asp 


Leu 


Tyr 


Leu 


Tyr 


Asp 


Pro 


Asn 


Gly 


Asn 


Gin 


Val 


Asp 


Tyr 


Ser 










485 










490 






495 


Tyr 


Thr 


Ala 


Tyr 


Tyr 


Gly 


Phe 


Glu 


Lys 


Val 


Gly 


Tyr 


Tyr 


Asn 


Pro 


Thr 


Asp 






500 










505 






510 


Gly 


Thr 


Trp 


Thr 


He 


Lys 


Val 


Val 


Ser 


Tyr 


Ser 


Gly 


Ser 


Ala 








515 










520 






525 


Asn 


Tyr 


Gin 


Val 


Asp 


Val 


Val 


Ser 


Asp 


Gly 


Ser 


Leu 


Ser 


Gin 


Pro 


Gly 






530 










535 








540 


Ser 


Ser 


Pro 


Ser 


Pro 


Gin 


Pro 


GlU 


Pro 


Thr 


Val 


Asp 


Ala 


Lys 








545 










550 








555 


Thr 


Phe 


Gln


Gly 


Ser 


Asp 


His 


Tyr 


Tyr 


Tyr 


Asp 


Arg 


Ser 


Asp 


Thr 








560 










565 










570 


Phe 


Thr 


Met 


Thr 


Val 


Asn 


Ser 


Gly 


Ala 


Thr 


Lys 


Ile


Thr 


Gly 


Asp 








575 










580 








585 


Leu 


Val 


Phe 


Asp 


Thr 


Ser 


Tyr 


His 


Asp 


Leu 


Asp 


Leu 


Tyr 


Leu 










590 










595 








600 


Tyr 


Asp 


Pro 


Asn 


Gln


Lys 


Leu 


Val 


Asp 


Arg 


Ser 


Glu 


Ser 


Pro 


Asn 










605 










610










615 


Ser 


Tyr 


Glu 


His 


Val 


Glu 


Tyr 


Leu 


Thr 


Pro 


Ala 


Pro 


Gly 


Thr 


Trp 


Tyr 








620 










625 








630 


Phe 


Leu 


Val 


Tyr 


Ala 


Tyr 


Tyr 


Thr 


Tyr 


Gly 


Trp 


Ala 


Tyr 


Tyr 


Glu 








635 










640 






645 


Leu 


Thr 


Ala 


Lys 


Val 


Tyr 


Tyr 


Gly 




















650 




















(2) INFORMATION FOR SEQ ID NO: 36















(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
TCTGAATTCG TTCTTTTCTG TATGG 25 

(2) INFORMATION FOR SEQ ID NO: 37 
(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
TGTACTGCTG GATCCGGCAG 20 

(2) INFORMATION FOR SEQ ID NO: 38 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 80 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: Pyrococcus furiosus 

(B) STRAIN: DSM3638 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

GGATCCATCA GATTTTTGAG TGTAGATCAA CCAGTATGCT GCATTTGTAA TTGTGAGATA 
ATATCTCCCG CGGGTAAGGT 

(2) INFORMATION FOR SEQ ID NO: 39 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 30 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
AGAGGCATGC GTATCCATCA GATTTTTGAG 30 

(2) INFORMATION FOR SEQ ID NO: 40 

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 20

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
AGTGAACGGA TACTTGGAAC 20 

(2) INFORMATION FOR SEQ ID NO: 41 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid (synthetic DNA) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
GTTCCAAGTA TCCGTTCACT 20 
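
As listed, SEQ ID NO: 40 and SEQ ID NO: 41 read as reverse complements of each other, i.e. the two strands of the same 20-base stretch. The following is a minimal sketch for checking that relationship; the function name `reverse_complement` and the constant names are illustrative assumptions and do not appear in the specification.

```python
# Minimal sketch: check that two oligonucleotides from the Sequence Listing
# are reverse complements. Names are illustrative, not from the specification.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Spaces (the listing prints bases in blocks of ten) are ignored.
    """
    bases = seq.replace(" ", "").upper()
    return "".join(COMPLEMENT[b] for b in reversed(bases))


# Sequences copied from the listing entries for SEQ ID NO: 40 and 41.
SEQ_ID_40 = "AGTGAACGGA TACTTGGAAC"
SEQ_ID_41 = "GTTCCAAGTA TCCGTTCACT"

if __name__ == "__main__":
    print(reverse_complement(SEQ_ID_40) == SEQ_ID_41.replace(" ", ""))
```

The same check can be applied to any other primer pair in the listing to see whether the two oligonucleotides anneal to opposite strands of one site.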
(2) INFORMATION FOR SEQ ID NO: 42

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 12 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
Ala Glu Leu Glu Gly Leu Asp Glu Ser Ala Ala Gln

5 10 



Claims 

1. A hyperthermostable protease having an amino acid sequence represented by SEQ ID No. 1 of the Sequence
Listing or functional equivalents thereof.

2. A hyperthermostable protease gene encoding the hyperthermostable protease or functional equivalents thereof as
defined in claim 1.

3. The hyperthermostable protease gene according to claim 2, which has a nucleotide sequence represented by SEQ 
ID No. 2 of the Sequence Listing. 

4. The hyperthermostable protease gene according to claim 2, which hybridizes to the hyperthermostable protease 
gene as defined in claim 3. 

5. A hyperthermostable protease having an amino acid sequence represented by SEQ ID No. 3 of the Sequence
Listing or functional equivalents thereof.

6. A hyperthermostable protease gene encoding the hyperthermostable protease or functional equivalents thereof as 
defined in claim 5. 

7. The hyperthermostable protease gene according to claim 6, which has a nucleotide sequence represented by
SEQ ID No. 4 of the Sequence Listing.

8. The hyperthermostable protease gene according to claim 6, which hybridizes to the hyperthermostable protease 
gene as defined in claim 7. 

9. A hyperthermostable protease having an amino acid sequence represented by SEQ ID No. 5 of the Sequence
Listing or functional equivalents thereof.

10. A hyperthermostable protease gene encoding the hyperthermostable protease or functional equivalents thereof as 
defined in claim 9. 

11. The hyperthermostable protease gene according to claim 10, which has a nucleotide sequence represented by 
SEQ ID No. 6 of the Sequence Listing. 

12. The hyperthermostable protease gene according to claim 10, which hybridizes to the hyperthermostable protease
gene according to claim 11.

13. A method for preparing a hyperthermostable protease which comprises culturing a transformant containing any
one of the hyperthermostable protease genes as defined in claims 2-4, 6-8 and 10-12, then harvesting a
hyperthermostable protease from the culture.
Fig. 1
Fig. 2

170 175 180 

Asp Gly Ser Gly Val Val Val Ala Val Leu Asp Thr Gly Val

5'-GAT GGT ACT GGT GTT GTT GTT GCA GTA CTT GAC ACG GGA GTT-3'

PRO-1F 5'-GGW WSD RRT GTT RRH GTH CCD GTD MTY GAC ACB GG-3'



Fig. 3 



365 370 375 

His Gly His Gly Thr His Val Ala Gly Thr Val Ala Gly Tyr
5'-CAC GCT CAC GGA ACT CAC GTA GCT GGA ACT GTT GCT GGT TAC-3' 



PRO-2F 5'-KST CAC GGA ACT CAC GTD GCB GGH ACD GTT GC-3'
PRO-2R 3'-GTG CCT TGA GTG CAH CGV CCK TGH CAA CGH CSA-5'



Fig. 4 

590 595 

Ser Gly Thr Ser Met Ala Thr Pro His Val Ser Gly Val Val 

5'-TCT GGA ACT TCG ATG GCT ACT CCA CAT GTC AGC GGT GTC GTT-3' 



PRO-4R 3'-CCD TGV AGB TAC CGD WGA GGB GTR CAV YSG CCH C-5'
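
The mixed primers shown in Figs. 2 to 4 use the standard IUPAC ambiguity codes (for example W = A or T, S = C or G, D = A, G or T). As a rough illustration of what such a primer represents, the sketch below counts how many distinct oligonucleotides a degenerate primer stands for and tests whether it is compatible with a given target stretch. The function names `degeneracy` and `matches` are hypothetical, and the primer text printed in this document may contain transcription errors, so no particular result for the figures is asserted here.

```python
# Minimal sketch: interpret IUPAC degenerate primers.
# Function names are illustrative assumptions, not from the specification.

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _clean(seq: str) -> str:
    """Drop the spaces used to print codons and normalise case."""
    return seq.replace(" ", "").upper()


def degeneracy(primer: str) -> int:
    """Number of distinct sequences covered by a degenerate primer."""
    count = 1
    for code in _clean(primer):
        count *= len(IUPAC[code])
    return count


def matches(primer: str, target: str) -> bool:
    """True if every target base is allowed by the primer code at that position."""
    p, t = _clean(primer), _clean(target)
    return len(p) == len(t) and all(b in IUPAC[c] for c, b in zip(p, t))


if __name__ == "__main__":
    # First codon of the forward primer in Fig. 2 against the codon GGT.
    print(degeneracy("GGW"), matches("GGW", "GGT"))
```

Comparing a primer codon by codon with the aligned gene sequence in this way shows which positions were left degenerate to cover related protease genes.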
Fig. 5

PmaC I 



Fig. 6 
Fig. 11
Fig. 11 (Cont'd)
Fig. 12
Fig. 12 (Cont'd)
Fig. 13
Fig. 16
Fig. 20
Fig. 31



3.2 M urea
6.4 M urea